Data Engineer

Charlotte, North Carolina

Diamondpick
Summary:

We are looking for a talented Data Engineer with hands-on experience in Hadoop, Google Cloud Platform (GCP), Apache Spark, and Python to join our growing data engineering team. You will be responsible for building, maintaining, and optimizing data pipelines that support analytics, machine learning, and business intelligence initiatives.

Key Responsibilities:
  • Design, develop, and maintain scalable ETL pipelines using Spark and Python (see the illustrative sketch after this list).
  • Process large volumes of structured and unstructured data using Hadoop ecosystem tools (e.g., HDFS, Hive, Pig).
  • Deploy and manage ETL workflows in GCP using tools like Dataflow, Dataproc, BigQuery, and Cloud Composer (Airflow).
  • Ensure data accuracy, consistency, and reliability through quality checks and validation processes.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data needs and deliver timely solutions.
  • Optimize data processing jobs for performance and cost-efficiency in cloud environments.
  • Maintain documentation of ETL processes, data flows, and system architectures.
  • Implement best practices in data security, governance, and compliance.
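
To give candidates a concrete sense of the day-to-day work, below is a minimal, illustrative PySpark ETL sketch, not a production pipeline. The bucket, project, and table names are hypothetical, and it assumes a Dataproc cluster where the spark-bigquery connector is available.

    # Minimal, illustrative PySpark ETL job; all resource names are hypothetical.
    # Assumes a Dataproc cluster with the spark-bigquery connector on the classpath.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl").getOrCreate()

    # Extract: read raw CSV files from a Cloud Storage bucket.
    raw = spark.read.csv("gs://example-bucket/raw/orders/*.csv",
                         header=True, inferSchema=True)

    # Transform: deduplicate, apply a simple quality check, derive a date column.
    clean = (
        raw.dropDuplicates(["order_id"])
           .filter(F.col("amount") > 0)
           .withColumn("order_date", F.to_date("order_ts"))
    )

    # Load: write the curated table to BigQuery via the connector.
    (clean.write.format("bigquery")
          .option("table", "example-project.analytics.orders")
          .option("temporaryGcsBucket", "example-tmp-bucket")
          .mode("overwrite")
          .save())

    spark.stop()
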
Required Qualifications:
  • Bachelor's degree in Computer Science, Data Engineering, or a related field.
  • 3+ years of experience in data engineering or ETL development.
  • Proficiency in Python for data processing and automation.
  • Strong hands-on experience with Apache Spark (PySpark preferred).
  • Experience working with Hadoop ecosystem (HDFS, Hive, Oozie, etc.).
  • Proven experience with GCP data services (BigQuery, Dataflow, Dataproc, Cloud Storage, Cloud Composer).
  • Strong SQL skills and understanding of data modeling and warehousing principles.
  • Experience working with both batch and streaming data.
Preferred Skills:
  • GCP Professional Data Engineer certification.
  • Familiarity with CI/CD and DevOps practices for data pipelines.
  • Exposure to containerization tools (Docker, Kubernetes) in data environments.
  • Experience with version control (e.g., Git) and workflow orchestration tools like Airflow (see the DAG sketch after this list).
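
As one example of the orchestration work mentioned above, here is a short, hypothetical Cloud Composer (Airflow) DAG that schedules a PySpark job like the sketch under Key Responsibilities; the project, region, cluster, and file names are assumptions.

    # Minimal, illustrative Airflow DAG for Cloud Composer; all IDs are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocSubmitJobOperator,
    )

    with DAG(
        dag_id="daily_orders_etl",
        start_date=datetime(2025, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # Submit the PySpark ETL script (see the sketch above) to Dataproc.
        run_etl = DataprocSubmitJobOperator(
            task_id="run_orders_etl",
            project_id="example-project",
            region="us-east1",
            job={
                "placement": {"cluster_name": "example-cluster"},
                "pyspark_job": {
                    "main_python_file_uri": "gs://example-bucket/jobs/orders_etl.py"
                },
            },
        )
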
Soft Skills:
  • Analytical mindset with strong attention to detail.
  • Excellent problem-solving and debugging skills.
  • Strong communication and collaboration abilities.
  • Ability to work in an agile, fast-paced, and collaborative environment.
Date Posted: 09 May 2025