Hybrid MLOps Engineer

About the role

  • The MLOps Engineer manages PyTorch-based training and inference workloads at Menlo HQ, building and maintaining robust infrastructure for AI models and optimization processes.

Responsibilities

  • Own and evolve the infrastructure behind PyTorch-based training and inference workloads
  • Build and maintain training and inference pipelines using PyTorch
  • Own and evolve inference serving infrastructure
  • Write and maintain robust tooling in Python and C++
  • Optimize compute workloads for bare-metal environments
  • Troubleshoot low-level networking issues
  • Set up and manage ML environments
  • Establish CI/CD patterns for AI workloads
  • Integrate monitoring, alerting, and incident response

Requirements

  • Deep expertise in PyTorch internals
  • Strong programming skills in Python and C++
  • Solid computer science fundamentals
  • Hands-on experience with vLLM and SGLang
  • Experience with RLHF and PPO training pipelines
  • Strong understanding of distributed training setups
  • Experience debugging and tuning bare-metal Linux servers
  • Familiarity with job schedulers such as Airflow
  • Strong grasp of containerized and cloud-native environments

Benefits

  • Flexibility in work arrangements
  • Professional development opportunities

Job title

MLOps Engineer

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

Bachelor's Degree
