Object Technology Solutions, Inc. (OTSI) has an immediate opening for a Sr. Data Engineer
Location: Irving, TX (Onsite)
JOB DESCRIPTION:
- Analyze and understand data sources & APIs
- Design and develop methods to connect to and collect data from different data sources
- Design and develop methods to filter and cleanse the data
- Design and develop SQL and Hive queries and APIs to extract data from the store
- Work closely with data scientists to ensure the source data is aggregated and cleansed
- Work with product managers to understand the business objectives
- Work with cloud and data architects to define a robust cloud architecture and set up pipelines and workflows
- Work with DevOps to build automated data pipelines
Total Experience Required
- 6 years
- 5+ years of experience with Hadoop (Cloudera)/big data technologies
- Advanced knowledge of the Hadoop ecosystem and big data technologies
- Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
- Experience designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python
- Experience with Spark programming (PySpark, Scala, or Java)
- Expert-level skill in building pipelines using Apache Spark
- Familiarity with core provider services from AWS, Azure, or Google Cloud Platform, preferably having supported deployments on one or more of these platforms
- Hands-on experience with Python/PySpark/Scala and basic machine learning libraries is required
- Exposure to containerization and related technologies (e.g. Docker, Kubernetes)
- Exposure to aspects of DevOps (source control, continuous integration, deployments, etc.)
- Proficiency in programming in Java or Python; prior Apache Beam/Spark experience is a plus
- System-level understanding: data structures, algorithms, distributed storage and compute
- Can-do attitude toward solving complex business problems; good interpersonal and teamwork skills
- Team management experience, including having led a team of data engineers and analysts
- Experience with Snowflake is a plus
Desirable Technical Skills
- Familiarity with HTTP and invoking web APIs
- Exposure to machine learning engineering
- Exposure to NLP and text processing
- Experience with pipelines, job scheduling and workflow management
Personal Skills
- Experience managing work with distributed teams
- Experience working in the Scrum methodology
- Proven sense of high accountability and self-drive to take on and see through big challenges
- Confident, takes ownership, and is willing to get the job done
- Excellent verbal communication and cross-group collaboration skills