Software Engineer

Columbia, Maryland

Description: We are seeking a highly skilled and motivated Sr. LLM Engineer to help advance our Language Model infrastructure. As a key member of our AI/ML team, you will be responsible for the training, hosting, and optimization of Large Language Model (LLM) instances within our compute environment. The ideal candidate has a strong passion for pushing the boundaries of language technology, a deep understanding of LLM architectures, and the grit to tackle complex challenges head-on. This role requires a self-reliant individual with the drive to identify and fix inefficiencies, improve the codebase, and optimize model performance. If you thrive in a fast-paced environment and have an unwavering commitment to delivering cutting-edge language solutions, this position is for you.

Responsibilities:

• Design, develop, and maintain the infrastructure for training, hosting, and serving LLM instances.

• Optimize model training pipelines to achieve high performance and resource efficiency.

• Implement and integrate state-of-the-art LLM architectures and techniques.

• Collaborate with cross-functional teams to understand business requirements and deliver impactful language solutions.

• Monitor and analyze model performance metrics, identifying areas for improvement and implementing optimizations.

• Develop and maintain documentation, best practices, and coding standards for LLM development and deployment.

• Stay up to date with the latest advancements in LLM research and industry trends, and incorporate them into our projects.

• Mentor and guide junior engineers, fostering a culture of continuous learning and knowledge sharing.

Skill Requirements:

• 12+ years of experience in software engineering, with a focus on machine learning or natural language processing.

• Degree in Computer Science, Artificial Intelligence, or a related field.

• Strong expertise in deep learning frameworks such as TensorFlow, PyTorch, or MXNet.

• Proficiency in programming languages such as Python, C++, or Java.

• Solid understanding of LLM architectures, training techniques, and evaluation methodologies.

• Familiarity with cloud platforms (e.g., AWS, GCP) and their machine learning services.

• Knowledge of software engineering best practices, including version control, testing, and continuous integration/deployment.

• Excellent problem-solving and debugging skills.

• Strong communication and collaboration abilities to work effectively with cross-functional teams.

Nice to Haves:

• Advanced degree (Master's or Ph.D.) in Computer Science, Artificial Intelligence, or a related field.

• Proven track record of implementing and deploying large-scale LLM systems in production environments.

• Experience with distributed computing frameworks like Apache Spark or Hadoop.

• Experience with natural language understanding, generation, and dialogue systems.

• Familiarity with techniques such as transfer learning, few-shot learning, and reinforcement learning.

• Contributions to open-source projects or research publications in the field of LLMs.

• Experience with serving models using APIs and building scalable inference pipelines.

• Knowledge of DevOps practices and tools like Docker, Kubernetes, and Jenkins.

YOE Requirement: 12 years with a B.S. in a technical discipline, or 4 additional years of experience in place of a B.S.
Date Posted: 01 April 2025