The Data Engineer will create automated tests to ensure data meets quality standards and will routinely work with our Product Development Team, handling requests and configuring connections to our predictive platform. The ideal Data Engineer has a healthy drive to collaborate, gather feedback, and tackle challenges through testing and learning.
Requirements
Essential Functions:
- Manage the end-to-end setup of our customers' data, from raw data sources to data ingestion pipelines and connections to our predictive platform
- Analyze and validate customer data from ingestion to production
- Monitor all data update processes and outputs to ensure quality and uptime
- Collaborate with other members of the Data Analytics Program and with the Product Development Team to discuss new use cases or help identify and fix data issues
- Develop and maintain proprietary technical documentation
- Solve day-to-day customer data challenges
- Design and Build Data Pipelines: Develop, maintain, and optimize scalable ETL pipelines for ingesting and transforming structured and semi-structured data from various sources.
- Set up, configure, and maintain Fivetran connectors to ensure data is extracted, loaded, and synchronized properly across systems.
- Data Modeling: Develop and maintain data models to ensure consistency and ease of access for analytics, business reporting, and predictive modeling.
- Collaborate with stakeholders to review and discuss Entity Relationship Diagrams (ERDs) to align database structures with business needs.
- Engage with cross-functional teams to understand their reporting requirements and ensure data models and pipelines meet those needs for accurate reporting.
- Optimize Data Systems: Monitor and enhance the performance, scalability, and efficiency of data systems and the accuracy of data with each version of the product.
- Data Governance, Cleaning and Quality: Implement data quality checks and enforce data governance standards to ensure availability, accuracy, cleanliness, and reliability
- Data Warehouse Administration: Administer and manage the organization's cloud data warehouse, ensuring efficient storage, retrieval, and management of data, while optimizing for performance, scalability, and security.
- Manually access and download data from external sources to ensure it is available for processing and analysis.
Required Education and Experience
- BS/BA degree in Engineering, Computer Science, Information Technology, or a related technical field
- Proficient with SQL and experience with SQL database design in a business setting
- 2+ years' experience with data models, data quality assurance, and ELT/ETL systems
- Excellent numerical and analytical skills
- Able to work cross-functionally and collaborate with BI power users, analysts, software engineers, and data scientists to meet data requirements
- Comfortable with unit testing and regression testing of data formats across different versions of the data products
Additional Ideal Experience
- Master's degree in data engineering or a related technical field
- Experience with AWS, Python scripting and notebooks, Sigma, Fivetran, dbt, Hightouch reverse ETL, and Snowflake in a business setting
- Experience with AWS architecture: Lambda functions, S3 buckets, and data streaming
- Experience parsing custom text log files and XML and JSON file formats
- Familiarity with different types of data masking: SHA, hashing, etc.
- Familiarity with data science and machine learning concepts
- Familiarity with the Agile Scrum framework and product management tools such as Atlassian Jira and Confluence
- Experience identifying and fixing issues in existing reports or dashboards in BI tools to ensure data accuracy and clarity for end users