Lead DevOps Engineer designing cloud infrastructure for ML/AI solutions in medical imaging. Collaborating across teams to build scalable, secure platforms that optimize data operations.
Responsibilities
Partner with ML research, data engineering, and application teams to translate requirements into reliable, secure, and cost-effective platform capabilities.
Lead design reviews, RFCs, and proof-of-concepts; mentor team members on cloud, Kubernetes, and data best practices.
Own incident response for platform components and drive continuous improvement through automation and standards.
Design and implement secure, scalable, multi-cloud (GCP + AWS) configurations.
Establish and maintain infrastructure as code (IaC) standards with Terraform.
Lead cloud-to-cloud data migration including secure transfer planning, checksum/manifest validation, parallelization, and cutover strategy.
Implement robust ingestion pipelines for medical images and metadata into structured data stores with schema management, versioning, and data lineage.
Optimize storage tiers and caching strategies for high-throughput image workloads.
Establish cost observability with budgets, alerts, showback/chargeback, and automated idle resource cleanup.
Own permissions and access management across clouds.
Plan and execute wind-down and exit from prior cloud providers: data egress, dependency mapping, app cutover, contract/savings plan termination, and archival with retention policies.
Stand up and maintain managed ML platforms (Vertex AI) or managed Kubernetes clusters (GKE/EKS) with CI/CD for pipelines, images, and deployments.
Partner with data/ML teams to codify data management practices: versioned datasets, reproducible preprocessing, clear lineage, and documentation.
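As a rough illustration of the checksum/manifest validation mentioned under cloud-to-cloud data migration, the sketch below builds a SHA-256 manifest for a source and a target directory tree and reports the differences. It is a minimal local-filesystem example, not the team's actual tooling: the function names (`sha256_of`, `build_manifest`, `diff_manifests`) are hypothetical, and a real migration would more likely compare checksums reported by the object stores themselves.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from the target, unexpected in it, or with different content."""
    shared = source.keys() & target.keys()
    return {
        "missing": sorted(source.keys() - target.keys()),
        "unexpected": sorted(target.keys() - source.keys()),
        "mismatched": sorted(k for k in shared if source[k] != target[k]),
    }
```

In this sketch, a cutover check would pass only when all three lists in the report are empty. Building the manifests is independent per file, so the hashing step could be parallelized without changing the comparison logic.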
Requirements
7+ years in DevOps/SRE/Platform roles, including multi-cloud (AWS/Azure/GCP) experience
Deep proficiency with Terraform, CI/CD (GitHub Actions/GitLab/CodeBuild/Cloud Build), and Kubernetes (EKS/GKE)
Hands-on experience with GPU workloads for ML training/inference and object storage patterns for large image datasets
Proven track record in data migration (cloud-to-cloud), structured data ingestion (e.g., BigQuery/Redshift/Postgres), and schema/governance
Mechanical/Reliability Engineer responsible for mechanical installations in Bergen op Zoom. Analyzing maintenance strategies and leading projects to enhance reliability.
Senior DevOps Engineer responsible for cloud infrastructure and deployments. Optimizing AWS services and ensuring system security and reliability for Verizon.
Senior DevOps Engineer responsible for automating infrastructure and building CI/CD pipelines for collaborative robotics company. Collaborating with global engineering teams from the Bangalore office.
Site Reliability Engineer Intern at Tencent working on gaming services and cloud native solutions. Collaborating with global teams to eliminate toil and enhance reliability.
Cloud/DevOps Specialist at N5X managing and optimizing critical cloud infrastructures for Brazilian energy trading. Collaborating with a multidisciplinary team to ensure high availability and performance.
Cloud/DevOps Specialist responsible for designing a hybrid architecture combining cloud and on-premises infrastructure for energy trading systems. Collaborating with a multidisciplinary team in a dynamic environment.
Reliability Engineering Specialist utilizing reliability tools and models to improve asset performance at Enbridge. Collaborating across teams to guide investment decisions for safe operations.
DevOps Engineer responsible for structuring and supporting cloud DevOps architecture in Brazil. Working strategically on automation and CI/CD practices with development teams in Pernambuco.
DevSecOps Software Engineer developing secure CI/CD pipelines for Boeing's military software systems. Collaborate with cross-functional teams and implement automation and security best practices.
DevOps Manager responsible for managing a team for multi-cloud solutions supporting the USAF Cloud One project. Focus on scalable cloud-native solutions and CI/CD practices.