Data Architect
Georgia IT Inc
Location: Palo Alto, CA (Hybrid)
Duration: Long Term
Rate: DOE


Key Responsibilities:

Data Orchestration:
  • Design, implement, and manage data workflows using Airflow to automate and orchestrate data processing tasks.
  • Optimize Airflow DAGs (Directed Acyclic Graphs) for performance and scalability.
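For illustration, a minimal DAG of the kind this role would own could look like the sketch below (assuming Airflow 2.4+ and the TaskFlow API; the pipeline and task names are hypothetical, not from this posting):

    # A minimal sketch, assuming Airflow 2.4+ with the TaskFlow API.
    # orders_pipeline and its task names are illustrative only.
    from datetime import datetime

    from airflow.decorators import dag, task

    @dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
    def orders_pipeline():
        @task
        def extract() -> list[dict]:
            # Pull raw records from an upstream source (stubbed here).
            return [{"order_id": 1, "amount": 42.0}]

        @task
        def transform(records: list[dict]) -> list[dict]:
            # Normalize values; real transformation logic would live here.
            return [{**r, "amount": round(r["amount"], 2)} for r in records]

        @task
        def load(records: list[dict]) -> None:
            # Write to the target store (stubbed).
            print(f"loaded {len(records)} records")

        load(transform(extract()))

    orders_pipeline()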
Task Management:
  • Develop and maintain distributed task processing using Celery and ensure robust task queue management with Redis or RabbitMQ.
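A representative Celery setup for this kind of work might be sketched as follows (assuming Celery 5.x with Redis as both broker and result backend; the app and task names are hypothetical):

    # A minimal sketch: Celery 5.x with Redis as broker and result backend.
    from celery import Celery

    app = Celery(
        "ingest",
        broker="redis://localhost:6379/0",
        backend="redis://localhost:6379/1",
    )

    @app.task(bind=True, max_retries=3)
    def process_record(self, record_id: int) -> str:
        try:
            # Real work (DB writes, API calls) would go here.
            return f"processed {record_id}"
        except Exception as exc:
            # Retry transient failures with exponential backoff.
            raise self.retry(exc=exc, countdown=2 ** self.request.retries)

Assuming the module is saved as ingest.py, a worker would be started with "celery -A ingest worker" and tasks enqueued with process_record.delay(42).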
Database Management:
  • Design and manage databases using Cosmos DB, MongoDB, and PostgreSQL.
  • Develop and maintain efficient data models and ensure data consistency and integrity.
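On the PostgreSQL side, relational modeling might be sketched as below (SQLAlchemy 2.x is an assumption, not named in the posting; Cosmos DB and MongoDB would use their own SDKs such as azure-cosmos and pymongo; the table and columns are hypothetical):

    # A minimal sketch, assuming SQLAlchemy 2.x over PostgreSQL.
    # The patients table and its columns are illustrative only.
    from sqlalchemy import String, create_engine
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    class Base(DeclarativeBase):
        pass

    class Patient(Base):
        __tablename__ = "patients"

        id: Mapped[int] = mapped_column(primary_key=True)
        mrn: Mapped[str] = mapped_column(String(32), unique=True)  # uniqueness guards integrity
        name: Mapped[str] = mapped_column(String(128))

    engine = create_engine("postgresql+psycopg2://user:pass@localhost/clinical")
    Base.metadata.create_all(engine)  # create tables if they do not exist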
API and Webhooks:
  • Implement and manage FastAPI webhooks to handle data ingestion and integration tasks.
  • Develop and maintain Azure Functions to support webhook operations and integrate with cloud services.
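A minimal webhook endpoint of this shape might look like the following (a sketch: the route, payload model, and shared-secret header are hypothetical, and in production the secret would come from configuration such as Azure Key Vault):

    # A minimal sketch of a FastAPI webhook receiver with a shared-secret check.
    from fastapi import FastAPI, Header, HTTPException
    from pydantic import BaseModel

    app = FastAPI()
    WEBHOOK_SECRET = "change-me"  # illustrative; load from config in practice

    class Event(BaseModel):
        event_type: str
        payload: dict

    @app.post("/webhooks/ingest", status_code=202)
    async def ingest(event: Event, x_webhook_secret: str = Header(...)) -> dict:
        # FastAPI maps the X-Webhook-Secret header to this parameter.
        if x_webhook_secret != WEBHOOK_SECRET:
            raise HTTPException(status_code=401, detail="invalid secret")
        # Hand off to the pipeline, e.g. enqueue a Celery task or publish to Kafka.
        return {"accepted": event.event_type}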
Streaming Data:
  • Implement and manage Kafka Streams to handle real-time data processing and streaming requirements.
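Kafka Streams itself is a JVM library; a comparable Python consume-transform loop (using confluent-kafka, an assumed stand-in rather than a stack requirement from the posting; topic and group names are hypothetical) might be sketched as:

    # A minimal sketch of a real-time consume-transform loop with confluent-kafka.
    import json

    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "realtime-etl",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["events.raw"])

    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None or msg.error():
                continue
            event = json.loads(msg.value())
            # Transform the event here before writing it downstream.
            print(event)
    finally:
        consumer.close()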
Data Lake Management:
  • Work with Iceberg to manage and optimize large-scale data lake storage and querying.
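A small sketch of a pushed-down Iceberg query (PyIceberg is an assumed client, since the posting names only Iceberg; the catalog name and table are hypothetical):

    # A minimal sketch, assuming PyIceberg with a catalog named "default"
    # configured in ~/.pyiceberg.yaml; analytics.events is illustrative.
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("default")
    table = catalog.load_table("analytics.events")

    # Push the filter down so Iceberg prunes data files before the scan.
    scan = table.scan(row_filter="event_type = 'order'", limit=100)
    print(scan.to_arrow().to_pandas().head())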
Collaboration and Communication:
  • Collaborate with data scientists, engineers, and business analysts to understand data requirements and provide technical solutions.
  • Document processes, architectures, and configurations to ensure knowledge sharing and compliance with best practices.
Required Skills and Qualifications:

Experience and Knowledge:
  • Proven experience with Airflow for data orchestration and workflow management.
  • Hands-on experience with Celery for task management and Redis or RabbitMQ for messaging.
  • Proficiency with Cosmos DB, MongoDB, and PostgreSQL for data storage and management.
  • Experience developing and managing webhooks using FastAPI and integrating with Azure Functions.
  • Knowledge of Kafka Streams for real-time data processing.
  • Familiarity with Iceberg for data lake management and optimization.
  • Healthcare domain experience is a plus.
Technical Skills:
  • Strong understanding of data pipelines, ETL processes, and data integration.
  • Proficient in Python, with experience in building and maintaining data-oriented applications.
  • Ability to work with large datasets and optimize performance across distributed systems.
Soft Skills:
  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration skills.
  • Ability to work independently and manage multiple priorities in a fast-paced environment.
Date Posted: 14 May 2025