Site Reliability Engineer

Cincinnati, Ohio

Federal Home Loan Bank

General Summary:

Site reliability engineers (SREs) are responsible for improving system reliability and resilience to make it faster and easier to develop and deploy new software capabilities. SREs focus especially on building automation to reduce manual effort and prevent operating incidents.

Principal Duties and Responsibilities:
  • Works with stakeholders such as product owners to define service level objectives (SLOs) for system operations. Tracks performance against SLOs in partnership with monitoring teams or other stakeholders, and ensures systems continue to meet SLOs over time.
  • Collaborates with software developers, engineers, and operations teams on opportunities to improve performance and stability of applications and systems.
  • Creates dashboards and reports to communicate key metrics.
  • Performs updates to application software to improve performance, scalability, and stability of systems.
  • Designs, codes, tests, and delivers software to automate manual operational work.
  • Participates in operational support and on-call rotation shifts for supported systems and products.
  • Performs analytics on previous incidents to understand root causes and better predict and prevent future issues.
  • Identifies, evaluates, and recommends monitoring tools and diagnostic techniques to improve system observability.
  • Remains current on site reliability engineering methods and trends such as observability-driven development and chaos engineering. Drives continuous improvement in software quality and infrastructure reliability and resilience.
  • Oversees, designs, implements, and manages DevOps capabilities using continuous integration/continuous delivery toolsets and automation.
  • Understands and implements the governance, assurance and standards activities associated with FHLB policies and procedures.
  • Performs other duties as needed to support the team and the business.
Minimum Knowledge, Skills and Abilities Required:
  • Knowledge at a level normally acquired through completion of a Bachelor's Degree in Computer Science, Information Technology, or a related study, or 4 years of equivalent experience.
  • Ability to collaborate in a team environment and to adapt effectively and quickly to a rapidly changing, highly regulated environment.
  • Advanced analytical and problem-solving skills to identify, research, and resolve server problems effectively and efficiently.
  • Strong verbal and written communication skills.
  • 3+ years of experience with programming and scripting languages (e.g. Java, C++, C#, Python, Bash, PowerShell).
  • 3+ years of experience with incident and response management.
  • Exposure to Agile and DevOps development methodologies.
  • Experience with working in cloud ecosystems, preferably Microsoft Azure.
  • Exposure to monitoring and observability tools (e.g. Dynatrace, Splunk, CloudWatch, New Relic, ELK, Prometheus, OpenTelemetry).
  • Exposure to configuration management systems (e.g. Puppet, Ansible, Chef, Salt, Terraform).
  • Exposure to continuous integration/continuous deployment tools (e.g. Git, TeamCity, Jenkins, Artifactory).
  • Demonstrates a commitment to diversity and inclusion. Promotes an environment of empathy and respect, ensures the inclusion of all team members, and actively engages in D&I events and learning opportunities.
Working Conditions:

Requires daily interaction with system and networking hardware and software using PCs/Servers for the majority of duties. Exposed to moderate noise volume when working in the server room. Requires lifting and moving equipment weighing approximately 30 lbs. when moving or installing switches, routers, or related equipment and configurations. Must be able to respond quickly to problems that affect production uptime, occasionally requiring work outside normal Bank hours (i.e. weekends, evenings or early mornings).

Notation: This position has been identified as "high risk" as outlined in the Bank's Background Check policy. Individuals occupying this position will be required to submit to a background check biennially. Such repeat background check(s) are considered a "condition of continued employment".


Date Posted: 16 May 2025