Lead Data Engineer

New York, New York

District Partners LLC
District Partners is engaged with a leading global organization at the intersection of professional services and technology, where innovation is key to delivering exceptional solutions. The client is starting a new innovation team and is seeking a Lead Data Engineer to design, build, and manage modern data infrastructure in support of enterprise-wide digital transformation. This role is central to establishing and scaling a robust data lake environment, enabling advanced analytics, AI, and data-driven decision-making across the organization. Reporting to the Director of Data Science and Engineering, the Lead Data Engineer will architect and optimize data pipelines and platforms to meet evolving business and technical needs.

This is a remote position; however, to be considered, candidates must reside in or near one of the following locations: New York, Pennsylvania, San Francisco, Los Angeles, Santa Monica, Boston, Washington, D.C., Reston, Chicago, San Diego County, or Seattle. Compensation for this role will vary based on geographic location and local market factors.

Key Responsibilities:
  • Design, develop, and maintain scalable data pipelines for ingesting, transforming, and storing both structured and unstructured data
  • Optimize and tune existing pipelines for improved performance, reliability, and efficiency
  • Collaborate with stakeholders to translate business requirements into scalable technical data solutions
  • Implement and enforce data governance, data quality, and security protocols in compliance with regulatory and internal standards
  • Drive the development of data lake architecture and best practices to support enterprise-level data initiatives
  • Troubleshoot and resolve data processing, performance, and infrastructure issues as they arise
  • Perform other related duties as required

Required Skills & Experience:
  • 5+ years of experience with data lake and data engineering platforms (e.g., Databricks, Unity Catalog, Delta Sharing, Auto Loader, Delta Live Tables, Snowflake, Spark, Hadoop, Hive)
  • Strong background in SQL databases and programming, with proficiency in Python/PySpark
  • Solid understanding of data architecture, ETL processes, and data pipeline design
  • Experience working with both structured and unstructured data in cloud environments
  • Skilled in optimizing large-scale data environments for scalability and performance
  • Familiarity with enterprise platforms such as Salesforce
  • Deep understanding of cloud infrastructure and computing principles
  • Experience implementing data security, governance, and compliance frameworks

Preferred Qualifications:
  • Bachelor's degree or higher in Computer Science, Engineering, or a related field
  • Experience with data modeling, warehousing, and metadata management
  • Working knowledge of RESTful APIs and integration methods
  • Relevant Data Engineering certifications (e.g., Databricks, Snowflake, Azure)
  • Exposure to machine learning workflows and generative AI technologies

Core Competencies:
  • Proven leadership capabilities in cross-functional, matrixed team environments
  • Strong interpersonal and communication skills, with the ability to clearly explain technical concepts to non-technical stakeholders
  • Ability to influence and collaborate across multiple departments and technical teams
  • Analytical mindset with strong problem-solving skills and attention to detail
  • Comfortable working in a fast-paced, evolving environment, both independently and as part of a team

Date Posted: 07 April 2025