Develop monitoring solutions to provide complete system coverage
Strategically plan and organize team deliverables
Practice and document disaster recovery scenarios
Streamline our software build and release pipeline
Engage in post-incident reviews to learn from failures and address their root causes
Be available for incident escalations
Requirements
Bachelor’s degree in Computer Science, Engineering, or a related field
10+ years of relevant experience in related roles
AWS Certified Solutions Architect – Professional certification
Expertise in scripting languages such as Python, Bash, and PowerShell
Advanced experience with infrastructure-as-code tools, including Terraform and CloudFormation
Strong time-management skills with the ability to prioritize tasks and meet deadlines
Excellent troubleshooting abilities to diagnose and resolve complex issues efficiently
Clear and effective communication skills, especially when explaining complex technical concepts
Proven leadership experience with the ability to guide and mentor a team of engineers
Benefits
Health & Wellness: Health care coverage designed for the mind and body.
Flexible Downtime: Generous time off helps keep you energized for your time on.
Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
Family Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
Beyond the Basics: From retail discounts to referral incentive awards—small perks can make a big difference.