Site Reliability Engineer (New York, NY)

Description

Summary of requirements

We are looking for hands-on SRE experts with strong experience at a large technology company that has applied SRE concepts and practices at scale. Key requirements:

• Minimum 5 years of experience as a Senior Engineer in SRE

• Minimum 2 years of experience at one of the following companies: Google, Uber, Lyft, LinkedIn, SoundCloud, Twitter

• Contributions to open source software in the SRE space (e.g. Prometheus, Kubernetes, Terraform, Ansible)

• Experience with containers and container orchestration (Docker, Kubernetes)

• Expertise in monitoring and metrics (Datadog, Prometheus, New Relic)

• Familiarity with IaC / infrastructure automation (Terraform, Ansible)

• Comfort with databases and in-memory key/value stores (MSSQL, Postgres, Redis, MongoDB)

• Solid knowledge of Linux/UNIX and networking fundamentals

• Proficiency in at least one of these languages: Python, Java

• You're comfortable building and operating infrastructure that employs a Chaos Monkey. Bonus: You've written that Chaos Monkey service

• Deep experience analyzing performance, end-to-end service experience, and overall system health
