Developer V

Reston, Virginia

Mindlance

Overall years of experience:

• 8+ years of related experience in the candidate's specific area, including experience leading teams on projects of similar scope and complexity.

• Bachelor's or master's degree in computer science or equivalent.

• Certifications: AWS Solutions Architect, Agile Certified Practitioner (ACP), or relevant cloud certifications.

Job Description:

We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in cloud platforms, DevOps practices, and modern software development frameworks. The SRE will play a critical role in designing, building, and maintaining highly scalable, fault-tolerant, and secure cloud infrastructure while ensuring operational excellence, high availability, and reliability.

Key Responsibilities:

1. Cloud Infrastructure & Automation:

• Design, implement, and manage cloud-based infrastructure using platforms like AWS, Azure, or GCP.

• Utilize Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, and Ansible to automate deployments and configurations.

• Create robust automation for anomaly detection, toil reduction, recovery processes, and self-healing mechanisms, and optimize cloud costs.

2. DevSecOps & CI/CD:

• Demonstrate a deep understanding of DevSecOps principles and CI/CD pipelines using tools like GitLab, Jenkins, SonarQube, Nexus/Artifactory, and Docker.

• Implement security best practices, including IAM roles, RBAC, vulnerability remediation, and SAST/DAST/SCA tools.

3. Observability & Incident Management:

• Design and implement monitoring, logging, and distributed tracing solutions using tools like AWS CloudWatch, Splunk/SignalFx, Dynatrace, and OpenTelemetry.

• Lead root cause analysis, blameless postmortems, and proactive incident management to minimize MTTR and MTTD.

• Define and monitor SLOs, SLIs, and error budgets to ensure system reliability.

4. Microservices & API Management:

• Architect and manage microservices, serverless computing, and RESTful APIs.

• Ensure fault tolerance and resilience using design patterns like Circuit Breaker, Retry, Timeout, and Bulkhead.

5. Chaos Engineering & Resiliency:

• Conduct chaos engineering experiments using tools like AWS FIS and Chaos Toolkit.

• Perform resiliency assessments using Resilience Hub and implement self-healing solutions.

6. Database & Application Support:

• Manage and optimize database technologies such as PostgreSQL, MongoDB, DynamoDB, Oracle, and Redshift.

• Provide production support, including incident response, problem management, and runbook creation. Participate in on-call rotations.

7. Collaboration & Communication:

• Collaborate with cross-functional teams to implement shift-left testing practices (BDD, TDD, unit, and regression testing).

• Create and maintain architecture diagrams, knowledge articles, and disaster recovery plans.

• Communicate effectively with stakeholders and demonstrate strong relationship management skills.

Required Skills & Qualifications:

• Expertise in cloud platforms (AWS, Azure, or GCP) and container orchestration.

• Proficiency in programming/scripting languages such as Python, Java, Node.js, Bash, and PowerShell.

• Strong knowledge of database technologies (e.g., PostgreSQL, MongoDB, DynamoDB, Oracle, Redshift).

• Experience with DevOps tools (Jenkins, Docker, Nexus/Artifactory) and build tools (Maven, Gradle).

• Familiarity with AI/ML integrations, event-driven architectures, and distributed systems.

• Expertise in observability, logging, and monitoring tools (AWS CloudWatch, Splunk, Dynatrace, OpenTelemetry).

• Strong understanding of security practices, including IAM, RBAC, and vulnerability management.

• Experience with chaos engineering, resiliency assessments, and disaster recovery planning.

• Proficiency in performance testing tools (JMeter, LoadRunner) and capacity planning.

• Excellent verbal and written communication skills, with the ability to collaborate across teams.

Preferred Qualifications:

• Experience with AI/ML libraries (e.g., NLTK, Transformers, spaCy, SciPy), Amazon SageMaker, and GenAI tools.

• Familiarity with project management tools like JIRA, Confluence, and ServiceNow.

• Knowledge of utilities like AWS CLI, Postman, and curl.

EEO:

"Mindlance is an Equal Opportunity Employer and does not discriminate in employment on the basis of - Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans."

Date Posted: 23 April 2025
