About the role

As a Senior ML Platform Engineer at Mistplay, you will research and develop machine learning solutions, collaborating with teams across the company to solve complex business problems and enhance the mobile gaming experience.

Responsibilities

Design, build, and operate standardized training-to-deployment pipelines with Airflow, covering artifact management, environment provisioning, packaging, deployment, and rollback for SageMaker endpoints.
Own real-time and batch inference on SageMaker: multi-model endpoints (MME), serverless inference where appropriate, blue/green and canary deployment strategies, autoscaling policies, and cost controls (spot strategies, instance sizing).
Implement very low-latency serving patterns using Redis/Valkey: feature caching, online feature retrieval, request-level state, model response caching, and rate limiting/backpressure for bursty traffic.
Provision and manage ML/data infrastructure with Terraform: SageMaker endpoints/configurations, ECR/ECS/EKS resources, network endpoints/VPCs, ElastiCache/Valkey clusters, observability stacks, secrets, and IAM.
Build platform abstractions and golden paths: Airflow DAG templates, CLI/SDK, cookie-cutter repositories, and CI/CD pipelines that move models from notebooks to production predictably.
Establish and manage model lifecycle governance: model/feature registries, approval workflows, promotion policies, lineage and audit trails integrated with Airflow runs and Terraform state.
Implement end-to-end observability: data/feature freshness checks, drift/quality controls, model performance/latency SLOs, infrastructure health dashboards, tracing and alerts, plus incident response and postmortems.
Collaborate with security, SRE, and data engineering teams on private networks, policy-as-code, handling of PII, least-privilege IAM, and cost-effective architectures across environments.
Evaluate, integrate, and rationalize platform tooling (e.g., MLflow registry, feature stores, service gateways); lead migrations with clear change management and minimal downtime.

Requirements

5+ years of experience building and operating production-grade ML/data platforms, with a focus on serving, reliability, and developer experience.
Strong software engineering skills in Python, Go, or Java; experience building resilient services, APIs, and automation tools with high test coverage.
Deep experience with AWS SageMaker inference: endpoint configuration, containerization, model packaging, autoscaling, trade-offs between serverless and real-time, MME, A/B and canary releases.
Expertise with online feature stores backed by Redis/Valkey in ML serving contexts.
Proven Terraform experience for end-to-end ML and data infrastructure management: modules, workspaces, drift detection, change review, and safe rollbacks; familiarity with GitOps patterns.
Large-scale Airflow orchestration: dependency modeling, sensors, retries, SLAs, backfills, DAG factories, and integrations with registries, artifact stores, and Terraform pipelines.
Familiarity with ML frameworks (scikit-learn, XGBoost, PyTorch, TensorFlow) from a platform integration perspective to support diverse runtimes and containers.
Observability for ML workflows: metrics/logs/traces, performance profiling, capacity planning, cost monitoring, and runbooks.
Excellent cross-functional communication and collaboration with data science, data engineering, DevOps, and backend teams.

Benefits

Team lunches
Game nights
Company-wide events

Hybrid Senior ML Platform Engineer II

at Mistplay
