Data Architect

Santa Clara, California

Vimerse InfoTech Inc
Apply for this Job
Job Description:

What you'll be doing:
  • Build data pipelines to transport data from data sources to the data lake.
  • Build data systems and pipelines ensuring that data sources, ingestion components, validation functions, transformation functions, and destination are well understood for implementation.
  • Develop and implement new end-to-end data systems for our Planning, Logistics, and Services initiatives.
  • Prepare data for prescriptive and predictive modeling by making sure that the data is complete, has been cleansed, and has the necessary rules in place.
  • Analyze and organize raw operational data including structured and unstructured data.
  • Lead discussions with stakeholders and IT to identify and implement the right data strategy given data sources, data locations, and use cases.
  • Interpret trends and patterns by performing complex data analysis.
  • Build/develop algorithms, prototypes, and analytical tools that enable the Ops teams to make critical business decisions.
What we need to see:
  • Master's or Bachelor's degree in Computer Science or Information Systems, or equivalent experience.
  • 8+ years of relevant experience, including programming knowledge (e.g., Python, SQL).
  • 5+ years of relevant experience with big data technologies and cloud platforms (e.g., Spark, AWS).
  • 5+ years of relevant experience with data lake technologies (e.g., Iceberg, Delta, Hudi).
  • 5+ years of experience with development best practices such as CI/CD, unit testing, and integration testing.
  • 5+ years of experience extracting data from source systems such as REST APIs, other databases via JDBC/ODBC, and SFTP servers.
  • Differentiating skill sets:
    • 2+ years of experience with Kubernetes and Docker.
  • Experience in developing required infrastructure for optimal extraction, transformation, and loading of data from various sources using AWS, Azure, SQL or other technologies.
  • Experience architecting, designing, developing, and maintaining data warehouses/data lakes for complex data ecosystems.
  • Experience working with large datasets, databases, and the software used to analyze them.
  • Expertise in data and database management, including data pipeline responsibilities spanning replication, mass ingestion, streaming, and API, application, and data integration.
  • Strong analytical skills with the ability to collect, organize, and disseminate significant amounts of information with attention to detail and accuracy.
  • Highly independent, able to lead key technical decisions, influence project roadmap and work effectively with team members.
Date Posted: 03 May 2025