Lead DevOps Engineer designing cloud infrastructure for ML/AI solutions in medical imaging. Collaborating across teams to build scalable, secure platforms that optimize data operations.
Responsibilities
Partner with ML research, data engineering, and application teams to translate requirements into reliable, secure, and cost-effective platform capabilities.
Lead design reviews, RFCs, and proof-of-concepts; mentor team members on cloud, Kubernetes, and data best practices.
Own incident response for platform components and drive continuous improvement through automation and standards.
Design and implement secure, scalable, multi-cloud (GCP + AWS) configurations.
Establish and maintain infrastructure as code (IaC) standards with Terraform.
Lead cloud-to-cloud data migration, including secure transfer planning, checksum/manifest validation, parallelization, and cutover strategy.
Implement robust ingestion pipelines for medical images and metadata into structured data stores with schema management, versioning, and data lineage.
Optimize storage tiers and caching strategies for high-throughput image workloads.
Establish cost observability with budgets, alerts, showback/chargeback, and automated idle resource cleanup.
Own permissions and access management across clouds.
Plan and execute the wind-down of and exit from prior cloud providers: data egress, dependency mapping, app cutover, contract/savings plan termination, and archival with retention policies.
Stand up and maintain managed ML platforms (Vertex AI) or managed Kubernetes clusters (GKE/EKS) with CI/CD for pipelines, images, and deployments.
Partner with data/ML teams to codify data management practices: versioned datasets, reproducible preprocessing, clear lineage, and documentation.
Requirements
7+ years in DevOps/SRE/Platform roles, including multi-cloud (AWS/Azure/GCP) experience
Deep proficiency with Terraform, CI/CD (GitHub Actions/GitLab/CodeBuild/Cloud Build), and Kubernetes (EKS/GKE)
Hands-on experience with GPU workloads for ML training/inference and object storage patterns for large image datasets
Proven track record in data migration (cloud-to-cloud), structured data ingestion (e.g., BigQuery/Redshift/Postgres), and schema/governance