Senior SRE Engineer managing cloud infrastructure and driving Infrastructure-as-Code adoption for Resideo. Designing resilient systems while ensuring the health of cloud platforms.
Responsibilities
Maintain public cloud infrastructure on at least one of Azure, AWS, or Google Cloud (GCP).
Build and maintain cloud infrastructure automation (IaC) using Terraform, ARM templates, or similar.
Build and maintain IT automation using tools like Ansible or Chef, or manage complex container-based applications with tools like Helm for Kubernetes.
Build, deliver, and deploy using modern technologies like Git, GitHub Actions, Jenkins, Octopus, Ansible, Docker, Kubernetes, or similar.
Build and maintain observability and monitoring across different IT platforms using Grafana, Prometheus, Elastic, Datadog, or similar.
Be part of an L2 team that provides 24/7 support in troubleshooting IT platform issues when required (less than 20% of working time).
Oversee all planned outages, assess root cause analyses (RCAs), and assist with major upgrades to ensure minimal downtime.
Requirements
Minimum of 3 years of working experience with at least one public cloud platform (Azure preferred but not required).
Minimum of 5 years of Windows/Linux experience.
Minimum of 2 years of experience with Terraform or other IaC platforms.
Strong knowledge of Elastic, Grafana, Prometheus or other observability platforms (Datadog, Dynatrace, etc.).
Proven experience running and/or managing large IT platform services across multiple availability regions.
Experience with container platforms such as Docker or Kubernetes, or similar.
Strong English communication (written and oral) skills are required.
Benefits
Employment in a strong, well-known international company as part of a global team.
Unlimited access to online training.
Flexible hybrid working arrangement to support work-life balance.
Meal ticket for each day worked.
Medical coverage to support your health and wellbeing.
Senior Reliability Engineer applying a variety of reliability techniques and managing projects at Baker Hughes. Collaborating with teams to meet customer expectations and enhance their success.
Staff Site Reliability Engineer managing large-scale systems and ensuring infrastructure reliability for NordVPN's services. Collaborate on automating platforms and solving complex technical challenges.
Site Reliability Engineer responsible for infrastructure performance and reliability at ASAPP, collaborating with product engineering teams and automating processes.
DevOps Technical Lead specializing in automation and CI/CD pipeline management at Stanley Black & Decker. Leading a team to enhance cloud infrastructure within an innovative technology environment.
DevOps Engineer for Vodafone Innovus enhancing DevOps solutions in IoT applications. Collaborating with software, QA, and systems engineers to optimize deployment and continuous integration.
DevOps Engineer accountable for the Salesforce DevOps program at S&P Global. Collaborating with Agile teams, managing releases, and enhancing DevOps processes.
DevSecOps Engineer designing secure cloud infrastructure at CredLens, ensuring best practices in security throughout the development lifecycle. Collaborating with engineering and data teams on dependability and compliance.
Senior Site Reliability Engineer ensuring reliability, scalability, and performance of services at Granicus. Leading automation processes and implementing best practices in site reliability engineering.
Senior Site Reliability Engineer at Coinbase, focusing on identity and access management tooling. Responsibilities include automation, cloud-native development, and maintaining secure system architectures.
Join CORTO as a DevOps Engineer working on AWS infrastructure to enhance legal tech solutions. Collaborate with a high-achieving team to optimize and support development environments.