Position: GenAI Platform Support Engineer
Location: Hartford, CT (onsite; local candidates only)
Skills: Infrastructure / Cloud experience / DevOps / Deployment handling / Solutions / Tools & Techniques: Terraform / AWS / Python
Key Responsibilities:
Assess and enhance the AI platform's data pipeline resilience.
Ensure that AI/ML models are fault-tolerant and scalable.
Identify bottlenecks in model inference and training pipelines.
Optimize model performance for real-time use cases.
Collaborate with the DevOps team on deployment improvements.
Implement automated testing for fault-tolerance scenarios.
Provide technical support for users of the Generative AI platform, troubleshooting issues and answering queries.
Monitor platform performance and implement optimization strategies.
Document processes, issues, and solutions for knowledge sharing and future reference.
Stay up to date with industry trends and advancements in generative AI technologies.
Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field.
Proven experience in platform support and maintenance.
Experience with database management (SQL, NoSQL).
Familiarity with cloud platforms (AWS, GCP, Azure).
Familiarity with provisioning Azure/AWS cloud services using Terraform scripts.
Knowledge of OpenAI is an added advantage.
Good knowledge of shell scripting and Python.
Knowledge of provisioning GenAI cloud services on AWS/Azure is an advantage.
Must have knowledge of Kubernetes on OpenShift.
Should have experience troubleshooting services deployed on Kubernetes.
Must have knowledge of DevOps build/release pipelines and an understanding of infrastructure needs.
Excellent problem-solving skills and attention to detail.
Strong communication skills and ability to work collaboratively.
Knowledge of containerization (Docker, Kubernetes).