Interview: virtual
Visa: USC and GC (Citizen is preferred)
Hybrid: Washington, DC
JD:
Summary: The Data Scientist is responsible for combining expert knowledge in statistical analysis, data mining, data engineering, and ML/AI to build complex end-to-end business analytics, artificial intelligence, and IT solutions. This is a customer-facing role within a cross-functional team, so the ability to manage timelines, work both autonomously and collaboratively, and communicate effectively is a must. As this position will help to support federal contracts with security requirements, you must be a US Citizen to qualify.
Responsibilities:
- Build full data pipelines, from data ingestion through processing/transformation and loading to visualization and analysis.
- Strong working experience with descriptive and predictive data mining tasks as well as key algorithms including ML and NLP-based learning techniques.
- Understand and develop quantitative simulations and models as part of larger decision systems.
- Use managed services within AWS or Azure to process structured and unstructured data into vectorized embeddings for model and search consumption.
- Develop and build modern AI-based applications using LLMs and RAG.
- Identify and incorporate new data sources into centralized, interoperable data stores via a variety of techniques including web scraping, APIs, secure file transfer, etc.
- Perform sentiment analysis, topic modeling, and text classification to understand opinions, identify trends, and extract key themes.
- Identify and present insights, trends, and problems from complex big data analysis to both technical and non-technical audiences.
- Collaborate with team members across the company to support client capture, proposal writing, and sales development.
- Support the technology team in staying current in emerging tools and techniques in machine learning, statistical modeling, and analytics.
- Create clear and compelling visualizations to communicate data-driven insights to stakeholders.
Skills you bring:
- Advanced degree in Statistics, Applied Mathematics, Data Science, Computer Science, Operations Research, or another closely related quantitative or mathematical discipline.
- 5+ years of relevant data science and analytics experience.
- 3+ years utilizing Python for modeling and data work with strong working expertise in standard libraries such as NumPy, pandas, matplotlib, and scikit-learn.
- 2+ years building and maintaining data pipelines in AWS or Azure.
- Expertise with SQL.
- Experience working with large cross-sectional and time-series data.
- Experience analyzing textual data using the latest NLP techniques.
- Substantial knowledge and understanding of statistical concepts.
- Demonstrated exceptional oral and written communication skills.
- The ability to work independently and in a team environment.
- The ability to work effectively across functions, levels and disciplines.
- Strong problem solving and critical thinking skills.
- Superior teamwork skills and a desire to learn, contribute, and explore.