About Us:
At GMI, we are at the forefront of scalable AI infrastructure solutions. Our platforms power state-of-the-art machine learning, enabling cutting-edge applications in the generative AI domain. As a fast-moving and innovative team, we thrive on leveraging open-source solutions and industry best practices to deliver robust, high-performance AI systems for our clients.
About the Role:
We are seeking a Software Engineering Intern who will focus on adapting and optimizing open-source foundation models for our GPU inference platform. You will work closely with experienced engineers and AI researchers, gaining hands-on exposure to large-scale model deployment techniques. This is an opportunity to build valuable skills in model optimization, GPU acceleration, and systems-level engineering while contributing to the next generation of AI-powered products.
Key Responsibilities:
- Model Adaptation & Integration: Adapt open-source foundation models (e.g., LLMs, vision transformers, multimodal models) to run efficiently on our custom GPU inference infrastructure.
- Performance Optimization: Identify bottlenecks in model inference pipelines, implement GPU kernels, and optimize code to reduce latency and improve throughput.
- Platform Tooling & Automation: Develop scripts and tooling for automating model conversion, quantization, and configuration processes to streamline deployment workflows.
- Testing & Validation: Implement benchmarking tests and validation suites to ensure model accuracy, reliability, and performance meet internal standards.
- Cross-Functional Collaboration: Work closely with machine learning researchers, MLOps engineers, and infrastructure teams to refine performance strategies and ensure smooth integration of foundation models into production environments.
- Documentation & Knowledge Sharing: Document adaptation procedures, best practices, and lessons learned. Contribute to internal knowledge bases and present findings in team meetings.
Qualifications:
- Educational Background: Currently pursuing a graduate degree in Computer Science, Electrical Engineering, or a related technical field.
- Programming Skills: Proficiency in Python; familiarity with Go and CUDA is a plus.
- Foundational Knowledge in Machine Learning: Understanding of attention-based models, PyTorch, and GPU-accelerated computing.
- Problem-Solving Mindset: Strong analytical skills, with the ability to troubleshoot performance issues and propose innovative optimization strategies.
- Team Player: Excellent communication skills, eagerness to learn, and the ability to collaborate effectively with diverse teams.
What You'll Gain:
- Real-world exposure to large-scale, production-grade AI deployments.
- Hands-on experience with state-of-the-art models and GPU acceleration techniques.
- Mentorship from experienced engineers and researchers.
- Opportunities to impact performance-critical aspects of cutting-edge AI products.
If you're passionate about AI systems engineering and excited to work at the intersection of machine learning and high-performance computing, we encourage you to apply.