Senior Distributed Systems Engineer

Frisco, Texas

Ayass BioScience, LLC
Job Expired

Shape the Future of Cloud-Native Big Data Architecture

At the intersection of massive scale and cutting-edge technology, we're looking for an exceptional engineer to architect next-generation distributed systems that process petabytes of data with millisecond precision.

Your Mission

You'll design highly resilient, auto-scaling distributed systems that handle complex data workloads across global infrastructure. Your architectures will become the backbone of our data ecosystem, powering critical business decisions and customer-facing features.

Key Responsibilities

  • Architect Scalable Data Platforms - Design and implement high-throughput distributed processing systems using Spark, Flink, Dask, or Ray that can handle exponential data growth
  • Engineer Parallel Computing Solutions - Build sophisticated batch and real-time pipelines capable of processing complex workloads with minimal latency
  • Optimize for Performance at Scale - Fine-tune data partitioning strategies, memory utilization, and computation models to achieve maximum throughput in multi-node environments
  • Deploy Cloud-Native Infrastructure - Work with our DevOps team to implement infrastructure-as-code solutions on AWS, GCP, or Azure using containerization and orchestration technologies
  • Implement Observability Systems - Create comprehensive monitoring solutions that provide deep visibility into distributed system performance
  • Drive Technical Excellence - Mentor junior engineers, contribute to architectural decisions, and evaluate emerging technologies that could provide competitive advantages

Technical Requirements

Education:

  • Bachelor's degree in Computer Science, Software Engineering, Data Science, or related technical field required
  • Master's degree or PhD in Distributed Systems, High-Performance Computing, or related specialization preferred
  • Equivalent practical experience will be considered for exceptional candidates with demonstrated expertise

Core Skills:

  • Deep expertise in Python, Java, or Scala with a strong focus on distributed system design patterns
  • Proven experience with fault-tolerance, consensus algorithms, and distributed computing principles
  • Production-level experience with major cloud platforms and their data service offerings

Technology Stack:

  • Processing Frameworks: Apache Spark, Apache Flink, Dask, Ray, Apache Beam
  • Streaming Technologies: Kafka, Pulsar, Kinesis, Dataflow
  • Orchestration: Kubernetes, Airflow, Argo Workflows, Prefect
  • Storage: HDFS, S3, Delta Lake, Parquet, ORC, Cloud-native object stores
  • Infrastructure: Terraform, Docker, Helm, CI/CD pipelines

Additional Qualifications:

  • Experience with ML infrastructure components (vector databases, feature stores, LLM deployment)
  • Contributions to open-source data engineering projects
  • Knowledge of high-performance computing techniques including GPU acceleration
  • Background in advanced performance tuning and distributed system debugging

What Sets You Apart

  • 5+ years of hands-on experience building and scaling distributed data systems
  • Exceptional ability to balance theoretical knowledge with practical implementation
  • Track record of ownership for complex, end-to-end architectures in production environments
  • Strong communication skills and collaborative mindset in a fast-paced engineering culture

Join us to build the next generation of scalable, resilient data systems that will transform how our organization leverages its most valuable asset: data.


Date Posted: 28 April 2025