Sr. Site Reliability Engineer

Atlanta, Georgia

AgreeYa Solutions

Job Description:

  • Lead and mentor a team of SREs, fostering a culture of collaboration, continuous learning, and operational excellence. Drive the adoption of SRE best practices and ensure adherence to reliability and performance standards.
  • Design and implement highly available, scalable, and fault-tolerant systems using AWS.
  • Collaborate with software engineering teams and other SREs to influence design and architecture decisions to improve system reliability and performance.
  • Develop and maintain automation scripts and tools to streamline operations, deployments, and monitoring processes.
  • Utilize Infrastructure as Code (IaC) and automation tools such as Terraform, CloudFormation, and GitHub Actions to manage infrastructure.
  • Implement and maintain robust monitoring, alerting, and logging systems using tools like Splunk, Grafana, or New Relic.
  • Lead incident response efforts, conduct root cause analysis, and implement measures to prevent recurrence.
  • Oversee the design and maintenance of CI/CD pipelines using tools like Jenkins, GitLab CI, or CircleCI. Ensure seamless and efficient code deployment processes, reducing time to market and increasing system reliability.
  • Conduct performance tuning and capacity planning to ensure systems can handle growing workloads. Troubleshoot, identify, and resolve performance bottlenecks in infrastructure and applications.
Date Posted: 17 May 2024