Summary: In this role, you will be a member of the PyTorch Core Systems team. The PyTorch team develops the open source software stack powering AI models and systems; the Systems team builds and optimizes the high-performance software used to train and serve AI architectures. You will work closely with AI researchers to analyze deep learning models and optimize their performance within PyTorch. You will also partner with researchers to understand modern advances in AI-guided software development and apply them directly to PyTorch code and device optimization. Examples of projects include:
- Rewriting core collectives to introduce fault tolerance with RDMA and GPUDirect, allowing training to continue even when nodes fail.
- Building a custom Python bytecode interpreter so PyTorch graphs can be captured without forcing users to rewrite their Python code.
- Rewriting PyTorch Distributed from scratch so you can pdb across a training job.
- Rewriting all of our C code so it is ABI-compatible for another 20 years.
- Fixing performance problems by changing a single register value from 1 to 0.
- Utilizing AI systems to optimize PyTorch compiler passes.
Required Skills: Software Engineer - Systems ML - PyTorch
Responsibilities:
- Improve PyTorch's state-of-the-art training, post-training, and inference on modern AI hardware accelerators.
- Develop PyTorch's software stack, with a focus on AI frameworks and high-performance kernel development.
- Tune and optimize the performance of deep learning framework and software components.
- Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP.
Minimum Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience.
- Proven C/C++ programming skills.
- Experience in AI framework development or accelerating deep learning models on hardware architectures.
Preferred Qualifications:
- Knowledge of GPU, CPU, or AI hardware accelerator architectures.
- Experience working with frameworks such as PyTorch, Caffe2, TensorFlow, ONNX, or TensorRT.
- OR AI high-performance kernels: Experience with CUDA, OpenMP, or OpenCL programming, or with AI hardware accelerator kernel programming. Experience accelerating libraries on AI hardware, such as cuBLAS, cuDNN, CUTLASS, HIP, or ROCm.
- OR AI compilers: Experience with compiler optimizations such as loop transformations, vectorization, parallelization, and hardware-specific optimizations such as SIMD. Experience with MLIR, LLVM, IREE, XLA, TVM, or Halide is a plus.
- OR AI frameworks: Experience developing training and inference framework components. Experience with system performance optimization, such as runtime analysis of latency, memory bandwidth, I/O access, and compute utilization, and development of the associated tooling.
Public Compensation: $85.10/hour to $251,000/year + bonus + equity + benefits
Industry: Internet
Equal Opportunity: Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at .