Senior Site Reliability Engineer at Cookpad Limited
Posted on: 03/27/2021
Location: Bristol (ON-SITE)
Tags: azure memcached ecs ruby chef grafana gcp jenkins github prometheus redis elasticsearch dynamodb terraform kubernetes aws mysql kafka
As an SRE at Cookpad you will have a lot of opportunities to innovate and explore new technologies. The SRE team is in the early stages of building a new platform based on Kubernetes for running Cookpad’s production services, and you’ll have a lot of say in its design and implementation. We love solving problems using open source technologies, and we contribute back to the community by open sourcing some of our tools.

In particular, you will:

* Guarantee the reliability of Cookpad’s services.
* Provide platforms and tooling that enable shipping to production easily and reliably.
* Mitigate incidents as part of our blameless post-mortem culture and build solutions and automation to prevent them from happening again.
* Provide support to a global and diverse organization working in different countries.

**Requirements**

This is a senior-level role and we are looking for the following skills and experience:

* Experience in software engineering and automation in one or more modern programming languages.
* Experience in containerization and deploying applications to Kubernetes, ECS or other container schedulers.
* Familiarity with at least one cloud environment, for example AWS, GCP, or Azure.
* Comfort managing software in a Linux-based environment.

In the SRE team we use the following technologies:

* Programming languages: Ruby and Go for scripting and building tools.
* Infrastructure-as-Code: Terraform, Jsonnet and Itamae (our own lightweight Chef).
* Cloud provider: AWS.
* Container schedulers: Kubernetes and AWS ECS.
* CI/CD: AWS CodeBuild, GitHub Actions and Jenkins.
* Observability: Grafana, Prometheus, Thanos, Alertmanager, Elasticsearch and Amazon CloudWatch.
* Data stores: MySQL, Redis, Memcached and DynamoDB.
* Event streaming: Kafka.