Data Quality Engineer Onsite

Pennsylvania

CyberTec
Apply for this Job
Job Title: Data Quality Engineer (Onsite)
Location: Onsite in either Bethlehem, PA (preferred) or Holmdel, NJ
Job Type: Hybrid
Duration: 6 months
Openings: 1

Seeking an experienced Data Engineer to join our Data and Analytics organization. You will play a key role in building and delivering best-in-class data and analytics solutions that create value and impact for the organization.
Responsibilities include:
  • Architect, build, and maintain scalable, reliable data pipelines with robust data quality checks built in, for consumption by the analytics and BI layers.
  • Design, develop, and implement low-latency, high-availability, performant data applications, and recommend and implement innovative engineering solutions.
  • Design, develop, test, and debug code in Python, SQL, PySpark, and Bash per company standards.
  • Design and implement a data quality framework and apply it to critical data pipelines to make the data layer robust and trustworthy for downstream consumers.
  • Design and develop the orchestration layer for data pipelines written in SQL, Python, and PySpark.
  • Apply and provide guidance on software engineering techniques such as design patterns, code refactoring, framework design, code reusability, code versioning, performance optimization, and continuous integration/delivery (CI/CD) to make the data analytics team robust and efficient.
  • Perform all job functions consistent with company policies and procedures, including those governing the handling of PHI and PII.
  • Develop relationships with business team members by being proactive, demonstrating a growing understanding of business processes, and recommending innovative solutions.
  • Communicate project output in terms of customer value, business objectives, and product opportunity.
Requirements:
  • 5+ years of experience and a Bachelor's or Master's degree in Computer Science, Engineering, Applied Mathematics, or a related field.
  • Extensive hands-on development experience with Python, SQL, Bash, and pytest.
  • Extensive experience in performance optimization of data pipelines.
  • Extensive hands-on experience with cloud data warehouse and data lake platforms such as Databricks, Redshift, or Snowflake.
  • Familiarity with building and deploying scalable data pipelines and data solutions using Python, SQL, and PySpark.
  • Extensive experience in all stages of software development and expertise in applying software engineering best practices.
  • Experience developing and implementing a data quality framework, either homegrown or built on an open-source framework such as Great Expectations, Soda, or Deequ.
  • Extensive experience developing an end-to-end orchestration layer for data pipelines using frameworks such as Apache Airflow, Prefect, or Databricks Workflows.
  • Familiarity with RESTful web services (REST APIs) for integrating with other services.
  • Familiarity with API gateways such as Apigee for securing web service endpoints.
  • Familiarity with concurrency and parallelism.
  • Familiarity with data pipelines and the client development cycle.
Date Posted: 14 May 2025