Title: Cloud Infrastructure / AI/ML Engineer
Duration: 12 months, plus possible extension
Location: Waltham, MA
Manager Notes:
Must have experience:
- Minimum of 3 years of experience in cloud infrastructure, AI/ML development, and bioinformatics pipeline management
- MS or PhD degree
- Demonstrated expertise with various computing platforms (HPC/high-performance computing, AWS, 7B, DNAnexus, AWS SageMaker AI)
- Cloud computing platforms (AWS, GCP); AWS or a similar platform is a must-have
- Python and Bash (must have)
- Strong communication skills
- Experience developing/deploying AI/ML frameworks
Nice to have / preferred experience:
- Containerization technologies (Docker, Kubernetes)
- AWS S3 and other data storage
- Experience with NGS data analysis workflows and automation (Snakemake, Nextflow)
Position Overview
We are seeking an experienced Cloud Infrastructure / AI/ML / Data Engineer to support a variety of projects across Genomic Medicine Unit (GMU) research and platform work. This contractor role focuses on providing infrastructure solutions that enable AI/ML model development and application. The position requires pipeline execution, environment management, AI/ML model development and deployment, and data management for various bioinformatics workflows.
Required Experience
• 3+ years of experience in cloud infrastructure, AI/ML development, and bioinformatics pipeline management
• Advanced degree (MS or PhD) in Bioinformatics, Computational Biology, Computer Science or related field
• Demonstrated expertise with various computing platforms (HPC, 7B, DNAnexus, AWS SageMaker AI)
• Strong background in NGS data analysis and pipeline automation
Key Responsibilities
• Pipeline Execution & Management: Run and maintain bioinformatics pipelines on cloud platforms
• Environment Management: Configure and support data/pipeline environments using 7B and DNAnexus
• AI Infrastructure: Set up and maintain environments for AI model development and inference (AWS SageMaker AI)
• Data Visualization: Create intuitive visualization tools for bench scientists and present data effectively
• LLM Development: Develop retrieval-augmented generation (RAG) LLM applications for GMU gene therapy use cases
• Automation: Develop and deploy agents to optimize and run routine NGS analyses and automate metadata verification
Technical Skills:
• Cloud computing platforms (AWS, GCP)
• Containerization technologies (Docker, Kubernetes)
• Programming languages (Python, R, Bash)
• Experience with NGS data analysis workflows and automation (Snakemake, Nextflow)
• Experience developing/deploying AI/ML frameworks
• Data visualization tools and libraries