Do you have experience building scalable, cloud-based systems from the ground up? Then this is the role for you.
As part of the Infrastructure Engineering team, you will design large-scale backend systems, implement scalable infrastructure to support large-scale Generative AI (GenAI) workloads, optimize systems for performance, and drive the development of GenAI infrastructure.
Required Qualifications:
Education Experience: Bachelor's degree in Computer Science, Computer Engineering, or a relevant technical field, or equivalent practical experience.
Professional Experience: A proven track record as a technical leader in developing scalable, maintainable, and high-performance cloud-native infrastructure. Hands-on involvement in the architecture, development, and deployment of reliable infrastructure. 8+ years of experience leading, mentoring, or managing software engineers.
Core Technical Skills:
- Practical expertise in one or more cloud platforms such as AWS, Azure, or GCP, including scaling with Docker and Kubernetes.
- Deep understanding of network architecture, protocols, and security best practices.
- Proficiency in Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Experience with containerization and orchestration technologies such as Docker and Kubernetes.
- Familiarity with scripting languages such as Python and Bash.
- Experience with service mesh (Linkerd, Istio) and API gateways (Kong, Traefik) to enhance microservices deployment and management.
- Ability to tackle complex problems, make data-driven decisions, and apply best practices in secure software development.
Preferred Qualifications:
- 10+ years of hands-on experience building and maintaining high-performance, scalable infrastructure.
- Experience developing tools, libraries, and infrastructure for data preprocessing, model training and fine-tuning, and deployment of LLMs in production environments.
- Proficiency in orchestration frameworks such as Flyte, MLflow, or similar technologies for automating complex workflows.