Principal Systems Engineer

Atlanta, Georgia

World Wide Technology

12 Months

Position Overview: We are seeking a skilled Principal Systems Engineer to join our team and lead reliability improvement efforts across our enterprise, with a particular focus on network infrastructure and systems. As a Principal Systems Engineer - Reliability, you will be instrumental in defining and coordinating reliability enhancement initiatives across our network. Leveraging your expertise in systems engineering and reliability analysis, you will drive strategic initiatives to minimize downtime, optimize performance, and enhance the overall reliability of our systems, significantly improving the customer experience.

Key Responsibilities:

  • Reliability Strategy: Develop and execute a comprehensive reliability strategy for our enterprise systems, including network infrastructure, server platforms, and software applications.
  • Enterprise Coordination: Collaborate with cross-functional teams to prioritize and implement reliability improvement efforts across the organization, ensuring alignment with business objectives and industry standards.
  • Root Cause Analysis: Define scalable processes for root cause analysis of system failures and performance degradation, identifying underlying issues and implementing corrective actions to prevent recurrence.
  • Risk Management: Assess potential risks to system reliability, such as hardware failures, software bugs, and configuration errors. Develop and implement risk mitigation strategies to enhance system resilience.
  • Performance Monitoring: Establish robust monitoring systems to track system performance metrics and reliability indicators. Analyze data to identify trends, anticipate potential issues, and proactively address reliability challenges.
  • Continuous Improvement: Drive a culture of continuous improvement by identifying opportunities for process optimization, automation, and efficiency gains in reliability enhancement initiatives.
  • Vendor Management: Collaborate with vendors and suppliers to ensure the reliability of system components and software. Evaluate vendor performance and provide feedback to drive product improvements.
  • Documentation and Reporting: Maintain comprehensive documentation of reliability improvement efforts, including procedures, policies, and incident reports. Prepare regular reports and presentations for senior leadership, highlighting progress and key performance metrics.

Qualifications:

  • Bachelor's degree in Engineering or a related field; Master's degree preferred.
  • Minimum of 10 years of experience in systems engineering, reliability engineering, or a related role within the telecommunications industry.
  • Proven track record of defining and coordinating enterprise-wide reliability improvement initiatives, with a focus on system and network infrastructure.
  • Deep understanding of system technologies, including network architecture, server platforms, and software applications.
  • Strong analytical and problem-solving skills, with the ability to conduct root cause analysis and drive effective solutions.
  • Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams and senior executives.
  • Experience with reliability engineering tools and methodologies, such as FMEA, RCM, and fault tree analysis, is preferred.
  • Relevant certifications such as CISSP, ITIL, or Six Sigma Black Belt are a plus.
Date Posted: 09 May 2024