Description: Hybrid, Greenwood Village, CO
Our client, a leader in their industry, has an amazing opportunity for a Data Scientist (NLP/Topic Modeling). In this role you will:
- Lead the topic modeling process (e.g., clustering, taxonomy) to identify new resolution categories from voice memos and text.
- Develop and refine classification models that map text inputs to specific resolution categories (initially 20+).
- Collaborate with Data/ML Engineers to build data pipelines and ensure model scalability.
Due to client requirements, applicants must be willing and able to work on a W2 basis. For our W2 consultants, we offer a great benefits package that includes Medical, Dental, and Vision benefits, 401k with company matching, and life insurance.
Rate: $68 - $78 / hr. W2
Responsibilities:
Topic Modeling & Taxonomy Development
- Perform text clustering and exploratory data analysis to uncover new resolution categories.
- Validate, refine, and finalize taxonomy in collaboration with technical and business stakeholders.
NLP Model Development
- Design and implement classification models (NLP or GenAI-based) to categorize inputs into the identified resolution types.
- Ensure the model handles both voice-transcribed text and technician memos effectively.
Model Deployment & Validation
- Work with Data/ML Engineers to integrate models into production.
- Develop and maintain performance metrics and dashboards.
Cross-Functional Collaboration
- Partner with the Data/ML Engineer to ensure smooth data ingestion, transformation, and model deployment.
- Communicate insights and recommendations to stakeholders.
Experience Requirements:
Technical Skills & Experience
- NLP/LLM Expertise: Experience in text analytics, topic modeling (LDA, clustering), classification techniques, and large language models.
- Programming: Strong Python (pandas, PySpark) skills; Scala/Spark exposure is a plus.
- Cloud: Familiarity with AWS (S3, EC2, EMR, or similar) and some Azure exposure (LLM hosting).
- MLOps: Experience with CI/CD for ML, model monitoring, and data version control is a plus.
- Data Exploration & Visualization: Ability to create meaningful visualizations and draw insights from data sets.
Additional Qualifications
- 5+ years of relevant data science experience (or equivalent).
- Demonstrable GitHub portfolio showcasing NLP or other data science projects.
- Strong communication skills to translate technical findings to stakeholders.