Software Engineer

Boston, Massachusetts

Acceler8 Talent

We're hiring a Software Engineer to lead efforts in optimizing and deploying machine learning inference across a range of modern hardware platforms. This role is critical as we expand the reach of our AI models into real-world applications requiring high-throughput and low-latency performance.

You'll be joining a team focused on building and shipping high-performance AI systems, with a strong emphasis on practical deployment and optimization. Our engineers work close to the hardware, blending systems expertise with machine learning fluency to deliver reliable and efficient solutions.

The Software Engineer will take ownership of optimizing inference pathways across GPU, CPU, and emerging accelerator platforms. You'll be expected to work independently, experiment with performance techniques, and interface closely with model and systems teams. The work is technical, hands-on, and has direct impact.

What we can offer you

Deep ownership over core inference systems
Collaboration with experts in systems, ML, and compiler optimization
Access to a wide range of hardware accelerators and software stacks
A tight feedback loop from experimentation to production
Clear, technical impact on real-world AI deployment
Competitive salary and equity package

Key responsibilities

Optimize inference stacks for GPU, CPU, and NPU architectures
Build and maintain performant inference pipelines using CUDA, C++, and Triton
Interface with Python/PyTorch-based ML models to ensure smooth deployment
Tune low-level primitives for maximum hardware utilization
Deliver end-to-end optimized inference setups with minimal supervision
Stay current with developments in quantization, decoding strategies, and model execution
Improve throughput, minimize latency, and adapt solutions across diverse environments

Date Posted: 24 April 2025
