Lead Data Engineer

New York, New York

District Partners LLC

District Partners is engaged with a leading global organization at the intersection of professional services and technology, where innovation is key to delivering exceptional solutions. The client is starting a new innovation team and is seeking a Lead Data Engineer to design, build, and manage modern data infrastructure in support of enterprise-wide digital transformation. This role is central to establishing and scaling a robust data lake environment, enabling advanced analytics, AI, and data-driven decision-making across the organization. Reporting to the Director of Data Science and Engineering, the Lead Data Engineer will architect and optimize data pipelines and platforms to meet evolving business and technical needs.

This is a remote position; however, to be considered, candidates must reside in or near one of the following locations: New York; Pennsylvania; San Francisco; Los Angeles; Santa Monica; Boston; Washington, D.C.; Reston; Chicago; San Diego County; or Seattle. Compensation for this role will vary based on geographic location and local market factors.

Key Responsibilities:

Design, develop, and maintain scalable data pipelines for ingesting, transforming, and storing both structured and unstructured data
Optimize and tune existing pipelines for improved performance, reliability, and efficiency
Collaborate with stakeholders to translate business requirements into scalable technical data solutions
Implement and enforce data governance, data quality, and security protocols in compliance with regulatory and internal standards
Drive the development of data lake architecture and best practices to support enterprise-level data initiatives
Troubleshoot and resolve data processing, performance, and infrastructure issues as they arise
Perform other related duties as required

Required Skills & Experience:

5+ years of experience with data lake and data engineering platforms (e.g., Databricks, Unity Catalog, Delta Sharing, Auto Loader, Delta Live Tables, Snowflake, Spark, Hadoop, Hive)
Strong background in SQL databases and programming, with proficiency in Python/PySpark
Solid understanding of data architecture, ETL processes, and data pipeline design
Experience working with both structured and unstructured data in cloud environments
Skilled in optimizing large-scale data environments for scalability and performance
Familiarity with enterprise platforms such as Salesforce
Deep understanding of cloud infrastructure and computing principles
Experience implementing data security, governance, and compliance frameworks

Preferred Qualifications:

Bachelor's degree or higher in Computer Science, Engineering, or a related field
Experience with data modeling, warehousing, and metadata management
Working knowledge of RESTful APIs and integration methods
Relevant Data Engineering certifications (e.g., Databricks, Snowflake, Azure)
Exposure to machine learning workflows and generative AI technologies

Core Competencies:

Proven leadership capabilities in cross-functional, matrixed team environments
Strong interpersonal and communication skills, with the ability to clearly explain technical concepts to non-technical stakeholders
Ability to influence and collaborate across multiple departments and technical teams
Analytical mindset with strong problem-solving skills and attention to detail
Comfortable working in a fast-paced, evolving environment, both independently and as part of a team

Date Posted: 07 April 2025
