Job Title: Senior Infrastructure Architect
Location: 235 S Grand Ave, Lansing, MI, 48933
Duration: 1 year with possible extension
Work Model: Hybrid (on-site 3 days/week; local candidates only, within 90 minutes of Lansing, MI)
Interview: Virtual via MS Teams
Position Overview: Join our dynamic team as a Senior Infrastructure Architect. In this role, you will leverage your expertise in High-Performance Computing (HPC) systems to design, deploy, and manage cutting-edge infrastructure and data environments. This is an exciting opportunity to contribute to innovative solutions and drive efficiencies within our organization.
Responsibilities:
- Manage and enhance High-Performance Computing (HPC) systems and environments
- Develop and maintain robust data management infrastructure
- Support SAN and NAS storage systems, as well as backup/recovery and virtualization platforms
- Contribute to strategic disaster recovery (DR) planning and testing, and help ensure high availability
- Provide comprehensive technical support, including installation, configuration, upgrades, and troubleshooting
- Utilize automation tools such as Ansible, Puppet, or Chef for efficient configuration
- Administer Mellanox switches, NAS clusters, and high-speed storage solutions
- Oversee cloud environments, managing compute engines and storage buckets
- Administer and support relational databases like PostgreSQL, MySQL, Oracle, and SQL Server
- Provide resources and support to internal teams and laboratories
Top Required Skills & Experience:
- 10+ years of Linux system administration (Ubuntu, CLI, system security, and networking)
- 10+ years of scripting proficiency with Bash and Python, particularly for pipeline automation (e.g., Nextflow)
- 10+ years of experience with the Slurm workload manager (installation, configuration, troubleshooting)
- 10+ years in cluster and storage management, including NAS (Qumulo), rsync, and mount strategies
- Experience in database management (PostgreSQL, MySQL, Oracle, SQL Server)
- Familiarity with configuration management tools (Ansible, Puppet, Chef)
- Strong skills in firewall configuration, memory management, load balancing, virtual machines (VMs), and system monitoring
- Expertise in enterprise storage solutions, SAN/NAS architecture, and Mellanox switching
- Hands-on experience with cloud platforms and object storage (buckets)
- Understanding of containerization and package management (Conda, Docker, Singularity)
- Familiarity with HL7 messaging and web.config interpretation
- Knowledge of security tools like Cloudflare and Forcepoint, including policy review
- Experience with IIS/Dynatrace log analysis, junction configurations, and app failover setup
- Experience establishing and executing Disaster Recovery (DR) plans and testing
- Familiarity with CDC-hosted applications (preferred)