Staff HPC and Infrastructure Administrator with Security Clearance

Greenbelt, Maryland

Varada Consulting
Job Expired
Staff HPC and Infrastructure Administrator
Clearance: US Citizenship is required / Ability to obtain a Public Trust
Location: Goddard Space Flight Center, Greenbelt, MD or Ames Research Center, Mountain View, CA
Schedule: Mon-Fri, regular business hours; hybrid, 3 days onsite/2 days remote

Overview:
Varada Consulting proudly supports NASA's High Performance Computing Services program in Mountain View, CA at the Ames Research Center and in Greenbelt, MD at Goddard Space Flight Center. Make a DIFFERENCE on a program that supports 4 on-site supercomputers with 18,000+ nodes and a combined 17+ petaflops.

We have an immediate opening for a Staff HPC and Infrastructure Administrator who will support the Technical Systems Manager responsible for HPC Infrastructure and Ancillary Infrastructure Systems. An individual at this skill level should have demonstrated extensive experience working with large HPC clusters from top-name vendors to maintain and manage HPC resources.

The individual will be engaged in the day-to-day operations and support of the infrastructure resources. Activities may include system patching, OS upgrades, deploying new systems, writing scripts, and troubleshooting system issues. The ability to interact with users to determine symptoms and then reproduce their issues to isolate the causes is critical for this work. There will also be activities in testing, benchmarking, user tool scripting, and analyzing trouble tickets to find patterns indicating system or user-education issues.

Responsibilities:

• Supercomputing and Infrastructure System Administration that contributes to:

• Installation, provisioning, and/or rebuilding systems in both HPC and infrastructure environments

• Patching of assigned systems to NASA requirements

• Maintain, extend, and develop customized scripts to support users, monitoring, and general system administration

• Day-to-day operational escalation of the Linux HPC clusters and storage systems

• Proactive monitoring, analysis, and correction of system issues

• Development of scripts to automate repetitive tasks or tools to enhance support of the HPC and infrastructure systems

• System performance analysis and tuning

• Building, installing, and supporting user-requested software

• Supporting evaluation and assessment of new HPC technology

• Resolving user-reported issues and managing support ticket requests in Remedy

• Staff support resource for a myriad of HPC and Infrastructure Systems

• Operationalize completed projects with appropriate knowledge transfer to tiered support groups

• Work extensively with HPC vendors and architects on bug fixes, kernel updates, and feature releases

• Apply best practices in systems engineering, delivering projects on time, on budget, with excellent quality

• After-hours/weekend support as required

Requirements:

• Bachelor's degree in computer science or related field

• Strong computer science background with in-depth systems-level knowledge in operating systems and networking

• A minimum of 3 years of Systems Engineering and Integration experience in heterogeneous, multi-platform HPC and Linux environments

• Solid understanding of the systems engineering process, including requirements, use cases, design, documentation, and testing in a Linux environment

• Demonstrated equivalent of 3 years of Linux/UNIX user support and hands-on experience with administration of Linux systems

• Superior scripting skills and excellent attention to detail; proficiency in at least one of Python, Perl, or Bash

• Excellent communication and people skills; excellent time management and organizational skills

• Experience with system configuration management tools, e.g., Puppet, Chef, Ansible

• Experience with revision control software, e.g., CVS, SVN, Git

• Proficiency in technical writing

Preferred Skills:

• Familiarity/proficiency with OpenMP and Message Passing Interface (MPI) programming

• Experience with Lustre and InfiniBand

• Experience with HPC schedulers (PBS, Slurm, Moab/Torque, SGE)

Join an Award-Winning Team. Voted as Most Innovative and Fastest Growing Company, Varada Consulting offers highly customized IT capabilities in the federal civilian and DoD market space in support of the mission objectives of the federal government. Varada provides competitive compensation and benefits packages including 100% employer paid healthcare premium.

Varada Consulting, LLC is an Equal Employment Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.
Date Posted: 22 April 2024