DevOps Engineer
12-month contract-to-hire
Hybrid: 4 days a week onsite in Plano, TX
We are seeking a DevOps Engineer with strong experience in cloud-native architecture, infrastructure automation, and platform reliability. This individual will play a key role in designing, building, and maintaining scalable infrastructure solutions with a focus on performance, security, and automation.
Core Responsibilities
- Implement cloud-native infrastructure and applications in AWS
- Drive infrastructure automation using Terraform or CloudFormation
- Build and maintain containerized environments with Docker and Kubernetes
- Participate in BCP/DR testing and planning for high availability and fault tolerance
- Review infrastructure code and automate operational procedures with runbooks and scripts
- Design and support large-scale, distributed systems with a focus on uptime and performance
- Lead implementation of observability and monitoring tooling (e.g., CloudWatch, Datadog)
- Collaborate across teams to guide security, compliance, and infrastructure governance
Required Skills & Experience
- 8+ years of experience in DevOps, SRE, or Platform Engineering
- Strong hands-on experience with AWS services (EC2, S3, Lambda, CloudWatch, Aurora Global Database)
- 3+ years of experience with Infrastructure as Code (Terraform, CloudFormation)
- 3+ years of containerization experience with Kubernetes and Docker
- Proficient in scripting and infrastructure programming (Python, Bash, etc.)
- Strong grasp of networking fundamentals and protocols (TCP/IP, DNS, HTTP, SMTP)
- Experience with cloud architecture patterns, cost models, and operational practices
- Solid understanding of CI/CD, version control, and automation strategies
- Proven experience in large-scale, always-on production environments
Preferred / Desired Skills
- Familiarity with compliance, security, and governance best practices (IAM, data protection, segregation of duties)
- Experience with zero-downtime deployment strategies (Blue/Green, Canary)
- Experience with troubleshooting and root cause analysis across application and infrastructure layers
- Performance tuning across compute, storage, CDN, and API layers
- Chaos Engineering or Failure Injection experience is a plus
- Experience applying SRE principles in production environments
- AWS or Azure certifications are a bonus
- Ability to lead by example, mentor peers, and share operational best practices