MLOps Engineer at Aerones | Hybrid Hired

About the role

Own the end-to-end ML pipeline for computer vision: data prep, training, evaluation, model packaging, artifact/version management, deployment, and monitoring (local GPU cluster + GCP).
Design and maintain containerized workflows for multi-GPU training and distributed workloads (e.g., PyTorch DDP, Ray, or similar).
Build and operate orchestration (e.g., Airflow/Argo/Kubeflow/Ray Jobs) for scheduled and on-demand pipelines across on-prem and cloud.
Implement and tune resource allocation strategies based on current and upcoming task queues (GPU/CPU/memory-aware scheduling; preemption/priority; autoscaling).
Introduce and integrate monitoring/telemetry for:
job health and failure analysis (retry, backoff, alerts),
data/feature drift and model performance (precision/recall, latency, throughput),
infra metrics (GPU utilization, memory, I/O, cost).
Harden GCP environments (permissions, networks, registries, storage) and optimize for reliability, performance, and cost (spot/managed instance groups, autoscaling).
Establish model governance: experiment tracking, model registry, promotion gates, rollbacks, and audit trails.
Standardize CI/CD for ML (data/feature pipelines, model builds, tests, and canary/blue-green rollouts).
Collaborate with CV researchers/engineers to productionize new models and improve training throughput & inference SLAs.
Continuously improve documentation: update existing pipeline docs and produce concise runbooks, diagrams, and “how-to” guides.

Requirements

Hands-on MLOps experience building and running ML pipelines at scale (preferably computer vision) across on-prem GPUs and a public cloud (GCP preferred).
Strong with Docker and Docker Compose in local and cloud environments; solid understanding of image build optimization and artifact caching.
GitLab CI/CD expertise (modular templates, YAML optimization, build/test stages for ML, environment promotion).
Proficiency with Python and Bash for pipeline tooling, glue code, and automation; Terraform for infra-as-code (GCP resources, IAM, networking, storage).
Experience with orchestration: one or more of Airflow, Argo Workflows, Kubeflow, Ray, or Prefect.
Experience operating GPU workloads: NVIDIA driver/CUDA stack, container runtimes, device plugins (k8s), multi-GPU training, utilization tuning.
Observability & monitoring for ML and infra: Prometheus/Grafana, OpenTelemetry/Loki (or similar) for metrics, logs, traces; alerting and SLOs.
Experiment tracking / model registry with tools like MLflow or Weights & Biases (runs, params, artifacts, metrics, registry/promotion).
Data versioning & validation: DVC/lakeFS (or similar), Great Expectations/whylogs, schema checks, drift detection.
Cloud services: GCP (Compute Engine, GKE or Autopilot, Cloud Run, Artifact Registry, Cloud Storage, Pub/Sub). Equivalent AWS/Azure experience is acceptable.
Security & compliance for ML stacks: secrets management, SBOM/image scanning, least-privilege IAM, network policies, artifact signing.
Solid understanding of containerized deployment patterns (blue-green/canary), rollout strategies, and rollback safety.

Benefits

Salary from **2,500 EUR to 5,500 EUR per month** (before Taxes)
A Birthday Gift
**After Probationary Period **
**Health Insurance**
**Health Recovery Days **(which can be taken as you need)
Paid **Study Leave**
Funding for the purchase of **Vision Glasses **after one (1) year of service

Similar roles

Browse all Machine Learning Engineer jobs

5 hours ago

GE

GetinzAI/ML Engineer

Innovation Engineer responsible for AI - driven solutions at a digital commerce company. Focused on prototyping, exploring technologies, and shaping technology strategy.

Hybrid Role

Bengaluru India Machine Learning Engineer

8 hours ago

TU

TubiSenior Machine Learning Engineer

Senior ML Engineer developing scalable machine learning systems for FOX advertising platform. Collaborating on ML solutions that optimize ad personalization and monetization.

Hybrid Role

Toronto Canada Machine Learning Engineer

8 hours ago

RI

RiverlaneSenior AI / ML Engineer

Senior AI/ML Engineer developing machine learning tools for quantum error correction at Riverlane. Collaborating with researchers to deliver innovative AI solutions in quantum computing.

Hybrid Role

Delft Netherlands Machine Learning Engineer

€70,000 - €90,000 per year

9 hours ago

TD

TDApplied Machine Learning Scientist I, Generative AI Model Validation

Applied Machine Learning Scientist validating Generative AI models for TD. Responsible for model validation and communicating findings to stakeholders while fostering collaborations.

Hybrid Role

Toronto Canada Machine Learning Engineer

CA$105,500 - CA$125,000 per year

yesterday

PL

PlanetSenior Software Engineer, Machine Learning

Senior Software Engineer developing machine learning geospatial products for Planet. Collaborating with engineers and scientists on innovative remote sensing analytics.

Hybrid Role

United States Machine Learning Engineer

$142,800 - $178,500 per year

yesterday

PA

PartSpaceMachine Learning Engineer – m/f/d

Machine Learning Engineer responsible for optimizing AI pipelines at Easy2Parts. Join a growing team to revolutionize component sourcing with AI technology.

Hybrid Role

Nuremberg Germany Machine Learning Engineer

2 days ago

NO

NokiaAI/ML Engineer

AI/ML Engineer developing and deploying machine learning solutions for Nokia's network optimization projects. Collaborating with cross - functional teams to enhance network planning capabilities.

Hybrid Role

Sao Paulo Brazil Machine Learning Engineer

2 days ago

CO

CoinbaseSoftware Engineer – Machine Learning Platform

Machine Learning Platform Engineer for Coinbase, building foundational components for ML at scale. Collaborating on fraud combat, personalizing user experiences, and blockchain analysis.

Hybrid Role

United States Machine Learning Engineer

$152,405 - $179,300 per year

2 days ago

CO

CoinbaseMachine Learning Engineer, Risk AI/ML

Machine Learning Engineer focused on building sophisticated models to protect Coinbase users from fraud. Engaging in hands - on technical role with modern AI/ML methodologies.

Hybrid Role

United States Machine Learning Engineer

$152,405 - $179,300 per year

2 days ago

GE

GEICOSenior Software Engineer – AI/ML Infra

Senior ML Platform Engineer developing and maintaining scalable ML infrastructure at GEICO. Focused on Large Language Models and collaborating with data science and engineering teams.

Hybrid Role

Chevy Chase United States Machine Learning Engineer

$105,000 - $300,000 per year