Summary
Prodigy Resources is seeking a Senior AI Data Engineer to design, build, and maintain cloud-native data platforms that power advanced analytics, real-time data applications, and AI/ML workflows. This role is critical to enabling machine learning in production environments by delivering robust, scalable, and high-quality data architectures.
Key Responsibilities
- Design and maintain scalable batch and real-time data pipelines using AWS-native tools.
- Develop and manage data lake and lakehouse architectures leveraging S3, Glue Catalog, Athena, EMR (Iceberg/Delta Lake), and Redshift.
- Collaborate with ML engineers and data scientists to operationalize ML models and support MLOps pipelines.
- Ensure data quality, observability, lineage tracking, and compliance across all data products.
- Design scalable data models for storage solutions including Amazon DynamoDB and other AWS-based services.
- Build and maintain online/offline feature stores to support low-latency AI applications.
- Contribute to decentralized data ownership and federated data governance models using tools such as Amazon DataZone.
- Partner with product managers, engineers, and analysts to align data infrastructure with strategic business goals.
- Design systems leveraging event-driven architectures and microservices principles.
- May lead projects and support cross-functional initiatives as needed.
- Occasional business travel may be required.
Required Skills and Qualifications
- Expertise in Python and SQL for ETL/ELT, data transformation, and automation.
- Deep experience with AWS services, including S3, Glue, Athena, Redshift, DynamoDB, Kinesis/MSK, Lambda, Step Functions, and EMR.
- Strong understanding of modern data architectures, including data lakes, lakehouses, and data warehouses.
- Hands-on experience with Apache Spark, Kafka, and orchestration tools such as Airflow.
- Familiarity with DevOps and infrastructure-as-code (IaC) tools like Pulumi, CloudFormation, or AWS CDK.
- Knowledge of data governance, security, and compliance (e.g., HIPAA, GDPR).
- Proficiency in semantic layer design using AWS Glue Catalog or third-party platforms like Alation and Collibra.
- Ability to integrate semantic assets into lakehouse ecosystems.
- Strong communication skills with the ability to work cross-functionally in Agile environments.
- A problem-solving mindset, attention to detail, and a commitment to delivering high-quality solutions.
Education and Experience
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field required.
- AWS certifications (Data Analytics, Machine Learning) or equivalent credentials preferred.
Core Competencies
- Drive for Results: Acts with urgency, persistence, and strategic focus to achieve ambitious goals.
- Customer Focus: Proactively anticipates needs and exceeds expectations through reliable, empathetic support.
- Self-Awareness: Reflects thoughtfully on actions and adapts behavior to drive better outcomes.
- Valuing Others: Builds trust and respect across diverse teams through transparent, inclusive collaboration.
- Learning Agility: Continuously seeks growth opportunities, feedback, and innovative ideas.
- Innovation: Brings forward and experiments with new ideas to drive efficiency, scalability, and value.
Core Values
- Treat everyone with dignity and Respect
- Earn and maintain Trust
- Provide Reliable services that open doors
- Serve with courtesy and Compassion
- Prioritize Safety
- Communicate with Transparency