Site Reliability Engineer managing multi-cloud infrastructure for S&P Global's Mobility Division. Collaborating with IT and Automotive teams while setting up infrastructure for cloud migration and expansion.
Responsibilities
Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
Partner with development teams to improve services through rigorous testing and release procedures.
Participate in system design consulting, platform management, and capacity planning.
Create sustainable systems and services through automation.
Balance feature development speed and reliability with well-defined service-level objectives.
Manage AWS and Azure infrastructure day to day.
Build and document automation processes for Infrastructure as a Service and Infrastructure as Code.
Manage backups and patching.
Requirements
Bachelor’s degree (or equivalent) in computer science or a related discipline, with at least 7 years of experience.
Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
Strong interpersonal, analytical, and problem-solving skills, along with strong written and verbal communication.
Solid understanding and hands-on experience with EKS.
Hands-on experience with CI/CD pipelines and DevOps tooling, including Git-based version control (GitLab preferred), pipeline design and maintenance, automated builds, testing, and deployments for cloud-native and containerized workloads.
Ability to communicate ideas in both technical and non-technical ways.
A strong capacity for teamwork, a sense of ownership, and the ability to work independently and be self-driven.
Hands-on experience with Linux servers, AD, LDAP, DNS, network storage, and AWS Compute services (EC2, FSx, Managed AD, Route 53, etc.).
Ability to script and automate with tools or languages such as PowerShell, Python, Ansible, Terraform, and Bash.
Familiarity with ITSM processes such as Incident, Problem, and Change Management, using ServiceNow (preferred).
Benefits
Health & Wellness: Health care coverage designed for the mind and body.
Flexible Downtime: Generous time off helps keep you energized for your time on.
Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
Family Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
Beyond the Basics: From retail discounts to referral incentive awards—small perks can make a big difference.