Data Engineer

Galveston, Texas

Annex It Solutions
Job Expired
Job Description:
We are seeking a skilled Data Engineer with 3+ years of experience to join our growing data team. The ideal candidate will have a strong background in designing, building, and maintaining scalable data pipelines and systems to manage large volumes of data. You will be responsible for transforming raw data into actionable insights, ensuring data quality, and supporting data-driven decision-making within the organization.

Responsibilities:
- Design, build, and maintain data pipelines that support data integration, transformation, and loading (ETL).
- Work with large, complex data sets from various sources (structured, semi-structured, and unstructured).
- Optimize and automate data workflows to ensure reliability and scalability.
- Collaborate with Data Scientists, Analysts, and Business Intelligence teams to ensure data is accessible and actionable.
- Implement data validation and quality checks, and ensure consistency across data platforms.
- Develop and maintain data models for reporting and analysis.
- Perform data profiling, cleansing, and transformation to improve data quality.
- Utilize cloud-based data platforms (AWS, GCP, Azure) to scale data infrastructure.
- Manage and support data warehouses (Redshift, BigQuery, Snowflake, etc.) and data lakes.
- Write efficient, reusable, and well-documented code for data processing and analysis.
- Monitor and troubleshoot issues related to data pipelines, ensuring smooth data flow across systems.
- Participate in cross-functional teams and collaborate with engineering, product, and business units to deliver high-quality solutions.

Required Skills and Qualifications:
- Data Engineering Experience: 3+ years of hands-on experience in data engineering, building and managing data pipelines.
- ETL Tools: Strong knowledge of ETL tools and technologies (Apache Airflow, Talend, Informatica, etc.).
- Data Processing: Experience with big data technologies such as Apache Hadoop, Spark, or Kafka.
- Programming Languages: Proficiency in Python, Java, or Scala for writing data processing scripts.
- Cloud Platforms: Familiarity with cloud data platforms such as AWS (Redshift, S3, Glue), Google Cloud (BigQuery, Dataflow), or Azure.
- SQL: Advanced knowledge of SQL for querying, transforming, and optimizing large data sets.
- Database Management: Experience with relational (MySQL, PostgreSQL) and NoSQL (MongoDB, Cassandra, etc.) databases.
- Data Warehousing: Experience building and maintaining data warehouses or data lakes.
- Data Modeling: Ability to design and optimize data models to support data analysis and reporting.
- Version Control: Experience using version control systems such as Git.
- Problem Solving: Strong analytical and troubleshooting skills for handling data-related issues.

Preferred Qualifications:
- Data Visualization: Familiarity with data visualization tools such as Tableau, Power BI, or Looker.
- Data Integration: Experience integrating data from multiple sources and APIs.
- DevOps Tools: Familiarity with CI/CD pipelines and tools such as Jenkins or GitLab CI for automating data workflows.
- Machine Learning: Exposure to machine learning pipelines or working knowledge of machine learning frameworks.
- Data Governance: Understanding of data governance, privacy, and security principles for handling sensitive data.
- Agile Methodologies: Experience working in Agile teams with iterative development cycles.
Date Posted: 07 May 2025