Job Title: Data Quality Engineer
Location: Onsite in either Bethlehem, PA (preferred) or Holmdel, NJ
Job Type: Hybrid
Duration: 6 months
Openings: 1
We are seeking an experienced Data Engineer to join our Data and Analytics organization. You will play a key role in building and delivering best-in-class data and analytics solutions that create value and impact for the organization.
Responsibilities:
- Architect, build, and maintain scalable, reliable data pipelines with robust data quality checks built in, so the analytics and BI layers can consume the data with confidence.
- Design, develop, and implement low-latency, high-availability, performant data applications, and recommend and implement innovative engineering solutions.
- Design, develop, test, and debug code in Python, SQL, PySpark, and Bash in accordance with company standards.
- Design and implement a data quality framework and apply it to critical data pipelines to make the data layer robust and trustworthy for downstream consumers (a minimal sketch follows this list).
- Design and develop an orchestration layer for data pipelines written in SQL, Python, and PySpark.
- Apply and provide guidance on software engineering techniques such as design patterns, code refactoring, framework design, code reusability, code versioning, performance optimization, and continuous integration and delivery (CI/CD) to make the data analytics team robust and efficient.
- Perform all job functions consistent with company policies and procedures, including those that govern the handling of PHI and PII.
- Develop relationships with business team members by being proactive, demonstrating a growing understanding of business processes, and recommending innovative solutions.
- Communicate project output in terms of customer value, business objectives, and product opportunity.
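For a flavor of the pipeline-embedded quality gate described above, here is a minimal PySpark sketch. The dataset path, column names, and rules (non-null/unique keys, non-negative amounts) are illustrative assumptions, not the team's actual framework.

```python
# Minimal sketch of a homegrown, pipeline-embedded data quality gate.
# Path, columns, and rules are hypothetical, for illustration only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-check-sketch").getOrCreate()

# Illustrative input: the batch the pipeline has just loaded (hypothetical path).
orders_df = spark.read.parquet("/data/lake/orders/latest")

total = orders_df.count()

# Rule 1: primary key must be non-null and unique.
null_keys = orders_df.filter(F.col("order_id").isNull()).count()
dupe_keys = total - orders_df.select("order_id").distinct().count()

# Rule 2: amounts must be non-negative.
bad_amounts = orders_df.filter(F.col("amount") < 0).count()

failures = {"null_keys": null_keys, "dupe_keys": dupe_keys, "bad_amounts": bad_amounts}
if any(v > 0 for v in failures.values()):
    # Failing fast keeps bad records out of the analytics/BI layer.
    raise ValueError(f"Data quality checks failed: {failures}")
```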
Requirements:
- 5+ years of experience, with a Bachelor's or Master's degree in Computer Science, Engineering, Applied Mathematics, or a related field.
- Extensive hands-on development experience in Python, SQL, Bash, and pytest (a sample test is sketched after this list).
- Extensive experience in performance optimization of data pipelines.
- Extensive hands-on experience working with cloud data warehouse and data lake platforms such as Databricks, Redshift, or Snowflake.
- Familiarity with building and deploying scalable data pipelines and data solutions using Python, SQL, and PySpark.
- Extensive experience in all stages of software development and expertise in applying software engineering best practices.
- Experience in designing and implementing a data quality framework, either homegrown or built on open-source frameworks such as Great Expectations, Soda, or Deequ.
- Extensive experience in developing an end-to-end orchestration layer for data pipelines using frameworks such as Apache Airflow, Prefect, or Databricks Workflows (an Airflow sketch follows this list).
- Familiarity with RESTful web services (REST APIs) for integrating with other services.
- Familiarity with API gateways such as Apigee for securing web service endpoints.
- Familiarity with concurrency and parallelism.
- Familiarity with data pipelines and the Client's development cycle.
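To illustrate the pytest requirement above, here is a minimal sketch of a unit test for a pipeline transform. normalize_amount is a hypothetical helper invented for this example, not a known company function.

```python
# Minimal pytest sketch for a hypothetical pipeline transform.
import pytest

def normalize_amount(raw: str) -> float:
    """Hypothetical transform: parse '$1,234.50' into 1234.50."""
    return float(raw.replace("$", "").replace(",", ""))

@pytest.mark.parametrize(
    "raw, expected",
    [("$1,234.50", 1234.50), ("0", 0.0), ("$10", 10.0)],
)
def test_normalize_amount(raw, expected):
    # Each parametrized case asserts the parsed value matches expectations.
    assert normalize_amount(raw) == expected

def test_normalize_amount_rejects_garbage():
    # Non-numeric input should fail loudly rather than pass bad data along.
    with pytest.raises(ValueError):
        normalize_amount("not-a-number")
```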
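And to illustrate the orchestration requirement, a minimal Apache Airflow sketch (assuming Airflow 2.4+, which accepts the schedule parameter) that sequences an extract, a transform, and a data quality gate. The DAG name and task callables are illustrative assumptions.

```python
# Minimal Airflow 2.4+ sketch of an orchestration layer for a data pipeline.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():        # placeholder for a SQL/ingestion step
    ...

def transform():      # placeholder for a PySpark transform
    ...

def run_dq_checks():  # placeholder for the data quality gate
    ...

with DAG(
    dag_id="daily_orders_pipeline",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_dq = PythonOperator(task_id="dq_checks", python_callable=run_dq_checks)

    # The quality gate runs last, so failures block downstream BI consumers.
    t_extract >> t_transform >> t_dq
```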