RTML Engineer

Irving, Texas

InfoVision, Inc.

Job Title: Real-Time Machine Learning Engineer (RTML Engineer)

Location: Irving, TX or Basking Ridge, NJ or Tampa, FL or San Diego, VA

Duration: Long-term

Responsibilities:

  • Functioning as a domain expert in RTML model serving technology, familiar with industry trends in RTML, common RTML architectures, leading third-party RTML serving products, and evaluation criteria.
  • Working closely with other teams to define technical strategy, architecture, and development choices, and ensuring the overall growth of the Jarvis framework to meet our internal customers' needs.
  • Leading the Jarvis development activities through phased releases, ensuring it is architecturally sound, implemented correctly/efficiently, and delivered on time.
  • Supporting internal customers with major framework issues and coordinating triage efforts to solve them.
  • Leading and mentoring junior developers on the team and always pushing for team success.
  • Adhering to industry standards and best practices and tracking emerging RTML technologies and trends to continuously improve the Jarvis framework.

Skills Needed:

  • Bachelor's degree or above in Computer Science/Engineering or a related area.
  • Four or more years of work experience in jobs related to computer software development.
  • At least two of those years in AI/ML engineering, with a reasonably good understanding of data science and AI/ML practices and workflows.
  • Strong expertise in the RTML model serving arena and/or large-scale cloud-based RT framework development.
  • Experience with programming languages such as Python and Java.
  • Experience in large application development in cloud environments: AWS, Google Cloud Platform, and on-prem clusters.
  • Experience with K8s architecture and principles of operation, with hands-on skills in deploying large applications to production K8s clusters, configuring K8s properly, and troubleshooting application issues.
  • Good understanding of RT system stats collection and performance monitoring methods.
  • Basic understanding of RT Feature Engineering methodology and practices.

Even better if you have one or more of the following:

  • Understanding of basic data science concepts and the common needs of data scientists.
  • Familiarity with common ML modeling frameworks such as TensorFlow and PyTorch.
  • Experience with a scalable distributed model serving framework like Ray Serve will be very helpful.
  • Strong collaboration skills and communication skills, especially when involving (non-tech) business stakeholders.
  • Experience with cloud infrastructure and MLOps in the cloud.
  • Familiarity with CI/CD processes and common frameworks such as ArgoCD.

Date Posted: 15 May 2024