Software Engineer

Boston, Massachusetts

Acceler8 Talent

We're hiring a Software Engineer to lead efforts in optimizing and deploying machine learning inference across a range of modern hardware platforms. This role is critical as we expand the reach of our AI models into real-world applications requiring high-throughput and low-latency performance.


You'll be joining a team focused on building and shipping high-performance AI systems, with a strong emphasis on practical deployment and optimization. Our engineers work close to the hardware, blending systems expertise with machine learning fluency to deliver reliable and efficient solutions.


The Software Engineer will take ownership of optimizing inference pathways across GPU, CPU, and emerging accelerator platforms. You'll be expected to work independently, experiment with performance techniques, and interface closely with model and systems teams. The work is technical, hands-on, and has direct impact.


What we can offer you

  • Deep ownership over core inference systems
  • Collaboration with experts in systems, ML, and compiler optimization
  • Access to a wide range of hardware accelerators and software stacks
  • A tight feedback loop from experimentation to production
  • Clear, technical impact on real-world AI deployment
  • Competitive salary and equity package

Key responsibilities

  • Optimize inference stacks for GPU, CPU, and NPU architectures
  • Build and maintain performant inference pipelines using CUDA, C++, and Triton
  • Interface with Python/PyTorch-based ML models to ensure smooth deployment
  • Tune low-level primitives for maximum hardware utilization
  • Deliver end-to-end optimized inference setups with minimal supervision
  • Stay current with developments in quantization, decoding strategies, and model execution
  • Improve throughput, minimize latency, and adapt solutions across diverse environments

    Date Posted: 24 April 2025