Python Data Engineer

Scottsdale, Arizona

Impresiv Health
Not accepting 3rd party vendors for this position

Job Title: Python Data Engineer
Duration: Temp to Perm
Location: Remote

Description:
We are looking for a Python Data Engineer who brings strong expertise in CMS datasets (MOR, MMR, MAO) and an understanding of healthcare regulations. The role requires proficiency with modern cloud data engineering tools, including Dataflow, BigQuery, and Airflow for orchestration, along with solid foundational knowledge of data warehousing concepts and optimization techniques for large healthcare datasets.

What You Will Do:
  • Design, develop, and maintain scalable ETL pipelines for CMS datasets using GCP Dataflow and Python.
  • Architect and manage data warehouses using BigQuery, ensuring scalability and cost-efficiency.
  • Implement Airflow DAGs for orchestration of complex data workflows and scheduling.
  • Ensure data quality, validation, lineage, and governance aligned with CMS and HIPAA compliance standards.
  • Optimize large-scale datasets through partitioning, clustering, sharding, and cost-effective query patterns in BigQuery.
  • Work collaboratively in Agile teams, using Jira for project tracking and Confluence for documentation.
  • Monitor and troubleshoot data pipelines, ensuring reliability and operational excellence.

You Will Be Successful If:
  • You are self-motivated, proactive, and able to thrive in a fast-paced, agile startup environment with minimal supervision.
  • You take strong ownership of tasks and deliverables and drive them to completion.
  • You are an eager self-learner who stays current with emerging technologies and industry trends.
  • You have excellent written and verbal communication skills and collaborate effectively across multidisciplinary teams.

What You Will Bring:
  • Bachelor's degree in Computer Science, Information Systems, or related field.
  • 3+ years of experience in cloud-based data engineering, preferably with healthcare datasets.
  • Expertise in building ETL pipelines using GCP Dataflow (Apache Beam) and Python.
  • Strong experience with BigQuery including schema design, optimization, and advanced SQL.
  • Hands-on experience with Airflow orchestration for large-scale data workflows.
  • Deep understanding of data warehouse concepts such as star schema, snowflake schema, normalization, and denormalization.
  • Proficiency in dataset optimization techniques: query optimization, partitioning, clustering.
  • Familiarity with Agile processes, Jira, Confluence, and cloud-native engineering best practices.
  • Knowledge of CMS datasets (MOR, MMR, MAO) and healthcare data compliance (HIPAA).

About Impresiv Health:

Impresiv Health is a healthcare consulting partner specializing in clinical & operations management, enterprise project management, professional services, and software consulting services. We help our clients increase operational efficiency by delivering innovative solutions to solve their most complex business challenges.

Our approach is, and has always been, simple. First, we think and act like the customers who need us. Most importantly, we deliver what larger organizations cannot: tangible results that add immediate value, at a rate that cannot be beaten. Your success matters, and we know it.

That's Impresiv.
Date Posted: 05 May 2025