Sr. System Reliability Engineer (Big Data)
St. Louis, Missouri (Hybrid; the resource must be based in St. Louis and ready to travel to the Missouri office whenever required)
Skillset required: ITSM, Production Support, Hadoop, Hive, Spark, NiFi, Impala
Secondary skills: Git/Bitbucket, Jenkins, Maven, Artifactory, and Chef
The Role
• Plan, manage, and oversee all aspects of a production environment for big data platforms.
• Define strategies for application performance monitoring and optimization in the production environment.
• Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
• Ensure that batch production scheduling and processes are accurate and timely.
• Create and execute queries against the big data platform and relational data tables to identify process issues or to perform mass updates (preferred); a sketch of such a query follows this list.
• Handle ad hoc requests from users, such as data research, file manipulation/transfer, and investigation of process issues.
• Take a holistic approach to problem solving by connecting the dots across the various technology stacks that make up the platform during a production event, in order to optimize mean time to recovery.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
• Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
• Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead Mastercard in DevOps automation and best practices.
• Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
• Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
• Work with a global team spread across tech hubs in multiple geographies and time zones.
• Share knowledge and explain processes and procedures to others.
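As a rough illustration of the ad hoc query work mentioned above, here is a minimal sketch using PySpark with Hive support to surface failed batch loads. The database, table, and column names (ops_db.batch_audit, batch_id, status, load_date) are hypothetical placeholders chosen for this example, not names from the posting.

from pyspark.sql import SparkSession

# Minimal sketch; all table and column names are hypothetical.
spark = (
    SparkSession.builder
    .appName("process-issue-check")
    .enableHiveSupport()
    .getOrCreate()
)

# Look for batches that failed today and count the affected records.
failed_batches = spark.sql("""
    SELECT batch_id, status, COUNT(*) AS record_count
    FROM ops_db.batch_audit           -- hypothetical batch audit table
    WHERE status = 'FAILED'
      AND load_date = CURRENT_DATE    -- limit to today's runs
    GROUP BY batch_id, status
""")
failed_batches.show(truncate=False)

The same query could be run directly in Hive or Impala; the PySpark wrapper is simply one convenient way to script it and feed the results into follow-up checks or mass updates.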
Requirements
• Experience with big data technologies (Hadoop, Spark, NiFi, Impala)
• Experience performing data analysis, data observability, data ingestion, and data integration.
• 5+ years of relevant data engineering, data infrastructure, DataOps, DevOps, SRE, or general systems engineering experience.
• 5+ years of experience running big data production systems.
• 1+ years of hands-on experience with industry-standard CI/CD tools such as Git/Bitbucket, Jenkins, Maven, Artifactory, and Chef.
• Experience architecting and implementing data governance processes and tooling (such as data catalogs, lineage tools, and role-based access control).
• Solid grasp of SQL fundamentals
• Experience with algorithms, data structures, scripting, pipeline management, and software design.
• Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
• Ability to help debug and optimize code and automate routine tasks.
• Ability to support many different stakeholders; experience dealing with difficult situations and making decisions with a sense of urgency is needed.
• Appetite for change and pushing the boundaries of what can be done with automation.
• Experience working across development, operations, and product teams to prioritize needs and build relationships is a must.
• Experience designing and implementing an effective and efficient CI/CD flow that gets code from dev to prod with high quality and minimal manual effort is desired.
• Good handle on the change management and release management aspects of software delivery.
Additional Remarks: The resource must be based in St. Louis and be ready to travel to the Missouri office whenever required.
Date Posted: 24 March 2025
Apply for this Job