Job Description:
- Work closely with a wide range of container automation tooling such as Kubernetes and AWS EKS
- Design, implement, and maintain a secure, scalable compute platform as it evolves with the industry
- Champion SRE methodologies around monitoring, alerting, and establishing SLOs and SLAs
- Identify and execute on opportunities to optimize existing systems, improve infrastructure, and eliminate manual work through automation
- Work alongside other teams to provide postmortem analysis of why services failed or became degraded.
- Design and build automation suites to streamline operational support.
- Good understanding of CNCF tools like ArgoCD, Crossplane and Kyverno
- Established understanding of observability fundamentals (Logging, Metrics, Tracing)
- Ability to learn quickly, master our existing systems and identify areas of improvement
- Strong technical background and the ability to think creatively to solve problems.
- Familiarity with Kubernetes Operators, controllers, and CRDs
- Participate in our on-call rotation for production services we build
- Deep understanding and application of computer science fundamentals: data structures, algorithms, and design patterns.
- Exposure to and understanding of cloud (AWS, Google Cloud, Azure, etc.) architectures and services.
- Excellent understanding of multi-cluster management and operating at scale
Date Posted: 01 April 2025