DevOps Engineer designing, automating, and optimizing cloud-native infrastructure across AWS, Azure, and GCP. Collaborating with teams to improve delivery workflows, reliability, and performance.
Responsibilities
Design, build, and maintain cloud-native infrastructures across AWS, Azure, and (optionally) GCP.
Implement scalable, secure, and highly available systems using Kubernetes, Terraform, and CI/CD pipelines.
Automate cloud provisioning and deployments, improve platform reliability, and optimize cost and performance.
Integrate observability tools (Datadog, Grafana, Prometheus, Splunk) into applications, and support teams with monitoring and troubleshooting.
Collaborate with developers, QA, and cross-functional teams to enable DevOps practices, streamline workflows, and improve delivery processes.
Support AI/ML workloads by designing infrastructure for training, inference, and MLOps pipelines (SageMaker, Azure ML, Vertex AI).
Maintain documentation, build self-service DevOps tools, and contribute to platform best practices.
Requirements
4+ years of experience in DevOps, SRE, or cloud platform engineering.
Strong expertise in AWS or Azure cloud architectures, networking, and security.
Skilled in Kubernetes (EKS/AKS), Docker, Helm, and modern infrastructure-as-code (Terraform).
Solid understanding of Linux systems, distributed systems, and scalable architecture design.
Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, Azure DevOps) and GitOps (ArgoCD).
Comfortable with observability tooling (Datadog, Splunk, Prometheus, Grafana).
Experience with AI/ML platforms or ML-driven workloads is a strong plus.
Ability to work well with cross-functional teams and communicate clearly, plus an enjoyment of building reliable, automated, developer-friendly platforms.
Senior Reliability Engineer applying a variety of reliability techniques and managing projects at Baker Hughes. Collaborating with teams to meet customer expectations and enhance their success.
Staff Site Reliability Engineer managing large-scale systems and ensuring infrastructure reliability for NordVPN's services. Collaborating on automating platforms and solving complex technical challenges.
Site Reliability Engineer responsible for infrastructure performance and reliability at ASAPP, collaborating with product engineering teams and automating processes.
DevOps Technical Lead specializing in automation and CI/CD pipeline management at Stanley Black & Decker. Leading a team to enhance cloud infrastructure within an innovative technology environment.
DevOps Engineer for Vodafone Innovus enhancing DevOps solutions in IoT applications. Collaborating with software, QA, and systems engineers to optimize deployment and continuous integration.
DevOps Engineer accountable for the Salesforce DevOps program at S&P Global. Collaborating with Agile teams, managing releases, and enhancing DevOps processes.
DevSecOps Engineer designing secure cloud infrastructure at CredLens, ensuring best practices in security throughout the development lifecycle. Collaborating with engineering and data teams on dependability and compliance.
Senior Site Reliability Engineer ensuring reliability, scalability, and performance of services at Granicus. Leading automation processes and implementing best practices in site reliability engineering.
Senior Site Reliability Engineer at Coinbase, focusing on identity and access management tooling. Responsibilities include automation, cloud-native development, and maintaining secure system architectures.
Join CORTO as a DevOps Engineer working on AWS infrastructure to enhance legal tech solutions. Collaborate with a high-achieving team to optimize and support development environments.