ML Engineer, Generative AI Innovation Center
The Generative AI Innovation Center at AWS empowers customers to harness state-of-the-art AI technologies for transformative business opportunities. Our multidisciplinary team of strategists, scientists, engineers, and architects collaborates with customers across industries to fine-tune and deploy customized generative AI applications at scale. Additionally, we work closely with foundational model providers to optimize AI models for Amazon Silicon, enhancing performance and efficiency.
As a Senior ML Engineer on our team, you will work with clients, partners, and other AWS teams to drive the development of custom Large Language Models (LLMs) across languages, domains, and modalities. You will be responsible for fine-tuning state-of-the-art LLMs for diverse use cases while optimizing models for high-performance deployment on AWS's custom AI accelerators. This role offers an opportunity to innovate at the forefront of AI, tackling end-to-end LLM training pipelines at massive scale and delivering next-generation AI solutions for top AWS clients.
Key Job Responsibilities
- Large-Scale Training Pipelines: Design and implement distributed training pipelines for LLMs using tools such as Fully Sharded Data Parallel (FSDP) and DeepSpeed, ensuring scalability and efficiency.
- LLM Customization & Fine-Tuning: Adapt LLMs for new languages, domains, and vision applications through continued pre-training, fine-tuning, and Reinforcement Learning from Human Feedback (RLHF).
- Model Optimization on AWS Silicon: Optimize AI models for deployment on AWS Inferentia and Trainium, leveraging the AWS Neuron SDK and developing custom kernels for enhanced performance.
- Customer Collaboration: Interact with enterprise customers and foundational model providers to understand their business and technical challenges, co-developing tailored generative AI solutions.
- Path to Production: Define the path to production for generative AI solutions and implement them at large scale.
Minimum Qualifications
- 3+ years of non-internship professional software development experience.
- 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems.
- Experience programming with at least one software programming language.
- Hands-on experience with deep learning and/or machine learning methods (e.g., for training, fine-tuning, and inference).
- Hands-on experience with generative AI technology.
Preferred Qualifications
- Bachelor's degree in Computer Science or equivalent.
- Hands-on experience with at least one ML library or framework.
- 2+ years of professional experience in developing, deploying, or optimizing ML models.
- 2+ years of professional experience in the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify, and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice to learn more about how we collect, use, and transfer the personal data of our candidates.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.