Service Management professional providing technical application support and Cloud Infrastructure Management for S&P Global's Markets group. Leads major incident management and fosters a culture of service excellence.
Responsibilities
Provide end-to-end ownership of Incident, Problem, Change, and Business Continuity processes, ensuring predictable, high-quality service delivery to internal and external customers.
Operate as the primary escalation authority for complex, high-impact production issues, coordinating across engineering, cloud, security, and vendor teams.
Partner closely with Product, Architecture, and Delivery teams to ensure operational readiness for releases, embedding reliability, supportability, and resilience early in the design lifecycle.
Drive continuous improvement initiatives across monitoring, alerting, reporting, automation, and operational maturity.
Embed AI/ML-driven operations (AIOps) to enhance anomaly detection, predictive alerting, intelligent noise reduction, and proactive incident prevention.
Influence and support technology governance, risk management, compliance, and audit activities related to service reliability.
Ensure 24x7 proactive monitoring and management of business-critical platforms, restoring service rapidly and minimizing customer impact.
Define and enforce incident severity models, ensuring accurate impact assessment, prioritization, and stakeholder communication.
Maintain end-to-end ownership of incidents, including those requiring third-line engineering or formal change execution.
Provide clear, consistent, and executive-level communication during incidents, outages, and service degradation.
Oversee application support spanning infrastructure, data remediation, user queries, education, and deep-dive incident investigations.
Drive observability across events, alerts, batch jobs, capacity planning, and performance KPIs, translating insights into actionable change.
Collaborate with functional and technical teams to ensure future deliverables (functional and non-functional) are operationally viable.
Champion knowledge management, ensuring high-quality runbooks, SOPs, and operational documentation in Confluence.
Deliver against SLA, OLA, and SLO commitments, with transparent reporting and corrective actions.
Leverage AIOps and reliability analytics to identify trends, systemic risks, and optimization opportunities at scale.
Requirements
Bachelor’s or Master’s degree in Computer Science, Engineering, or related discipline.
Ideally 10-12 or more years of progressive experience in SRE, DevOps, Platform Engineering, or Technology Operations, including leadership responsibility.
Proven experience designing and operating high-availability, disaster-recovery, and incident response capabilities across AWS, Azure, or GCP.
Strong understanding of ITIL-aligned Service Management processes and enterprise operational governance.
Deep expertise with observability platforms such as Splunk, CloudWatch, Prometheus, Grafana, Datadog, or equivalent.
Strong database expertise (Oracle / PostgreSQL), including advanced SQL tuning, performance optimization, and operational troubleshooting.
Demonstrated experience leading post-incident reviews and driving preventative engineering outcomes.
Excellent decision-making and leadership capabilities under high-pressure, executive-visible incidents.
Strong knowledge of Linux and Windows operating systems, automation, and scripting (Python preferred).
Solid understanding of SDLC, Agile methodologies, defect triage, and engineering collaboration models.
Prior experience in Financial Services and/or S&P Global technology platforms is highly desirable.
Right to Work Requirements: This role is limited to persons with the indefinite right to work in the United States.
Benefits
Health & Wellness: Health care coverage designed for the mind and body.
Flexible Downtime: Generous time off helps keep you energized for your time on.
Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
Family Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
Beyond the Basics: From retail discounts to referral incentive awards, small perks can make a big difference.