Senior On-Premises Distributed Systems Engineer
Building Specialized High-Performance Computing Infrastructure for Transcriptome Analysis
At the intersection of cutting-edge genomics and precision engineering, we are seeking a highly skilled engineer to architect, deploy, and operate our next-generation in-house distributed computing cluster, designed specifically to support large-scale transcriptome data analysis with high throughput and low latency.
Your Mission
You will build a purpose-built high-performance computing (HPC) cluster from the ground up, sized at 5 to 10 nodes, to process terabytes of sequencing data more efficiently than generic cloud services can. Your infrastructure will be the computational backbone for breakthrough discoveries in transcriptomics and computational biology.
Key Responsibilities
Architect and Deploy Specialized HPC Infrastructure
Design, build, and optimize a bare-metal distributed computing environment, including hardware selection, rack design, node setup, and network architecture.
Implement Custom Cluster Management Solutions
Deploy and fine-tune resource managers (Slurm, PBS, SGE) for efficient job scheduling and high utilization across a small-scale, high-performance cluster.
Design High-Performance Storage Systems
Architect and implement scalable, high-bandwidth distributed storage (such as Lustre, BeeGFS, or Ceph) optimized for transcriptome sequencing workloads.
Optimize Network Topology for HPC Workloads
Build low-latency, high-throughput networks using high-speed Ethernet or InfiniBand to maximize node-to-node communication for parallel computing tasks.
Develop Custom Solutions for Bioinformatics Workloads
Create tailored hardware/software pipelines for specialized needs like RNA-Seq analysis, distributed transcript quantification, or real-time expression profiling.
Collaborate Across Disciplines
Work directly with bioinformaticians, data scientists, and machine learning researchers to align hardware architecture with algorithmic needs.
Technical Requirements
Education:
Bachelor's degree in Computer Science, Computer Engineering, Systems Engineering, or a related technical field required
Master's or PhD in High-Performance Computing, Distributed Systems, or a related field preferred
Equivalent practical experience building distributed infrastructure will be considered
Core Skills:
Strong background in bare-metal systems engineering, distributed computing, and HPC architecture
Proven experience designing and operating on-premises clusters (5-50 nodes preferred)
Deep understanding of parallel processing, storage system optimization, and high-speed networking
Technology Stack:
Cluster Management: Slurm, PBS, SGE, HTCondor, Kubernetes (on bare-metal)
Distributed Storage: Lustre, BeeGFS, Ceph, HDFS, object storage tuning
Networking: InfiniBand, RDMA over Converged Ethernet (RoCE), 10/25/40/100G networking
Performance Monitoring: Prometheus, Grafana, Ganglia, Nagios
Hardware Management: IPMI, BMC, hardware health and diagnostics tools
Bonus Qualifications
Experience with accelerators (GPUs, FPGAs) for computational biology workloads
Familiarity with bioinformatics file formats (FASTQ, BAM, GTF) and their storage implications
Background in scientific computing centers, genomics research labs, or national lab HPC projects
Experience integrating on-prem clusters with cloud burst capacity (hybrid setups)
What Sets You Apart
3+ years of hands-on experience designing, implementing, and optimizing bare-metal HPC clusters
Strong ownership and leadership in physical infrastructure projects
Practical experience balancing compute-intensive and I/O-intensive transcriptome analysis workloads
Ability to design for both current needs (5-10 node cluster) and future scalability (20+ nodes, hybrid extensions)
Example Projects You Will Lead
Architect our next-generation 5-10 node transcriptome processing cluster capable of handling terabytes of RNA-Seq data
Design specialized I/O and memory architectures for distributed genomics file processing
Build resilient infrastructure balancing compute-bound and I/O-bound bioinformatics workloads
Integrate real-time RNA expression pipelines with custom-built distributed storage solutions
Our Team Environment
You will collaborate with multidisciplinary teams of bioinformaticians, data scientists, software engineers, and AI researchers.
While they innovate on algorithms and scientific analysis, you will architect and operate the custom-built infrastructure that enables their research at scale.
We Encourage Applications From
Candidates with backgrounds in research computing centers, supercomputing facilities, genomics labs, or national HPC projects. We value hands-on experience with physical infrastructure design and implementation over purely cloud-based experience.
Join us to architect the specialized computing infrastructure that will enable the next generation of transcriptome research and breakthrough discoveries in genomics.