MLOps Engineer building and operating scalable ML training & serving infrastructure for Epidemic Sound’s music search, recommendation, and audio ML systems.
Responsibilities
Design, build, and maintain the core infrastructure that powers machine learning applications.
Streamline the entire ML lifecycle and implement next-generation technologies.
Build scalable infrastructure for training and serving machine learning models using Kubernetes (GKE).
Develop and optimize CI/CD pipelines to streamline the ML application lifecycle from development to production.
Implement and manage robust ML monitoring and observability solutions to ensure production model reliability.
Collaborate with Machine Learning Engineers, Data Engineers, and product teams to integrate data pipelines and tools like Vertex AI and feature stores.
Work within a team of MLOps engineers inside a larger cross-functional group.
Requirements
Proven experience in MLOps, with a deep understanding of best practices like ML monitoring and CI/CD for machine learning.
Proficiency with Kubernetes in a production environment.
Hands-on experience with pipeline orchestration tools such as Vertex AI Pipelines, Kubeflow Pipelines, Flyte, or Metaflow.
Infrastructure as Code skills, particularly with Terraform.
Experience with cloud-native data processing and workflow orchestration tools like Dataflow or Airflow.
Nice to have: Experience with Google Cloud Platform services like BigQuery and Google Cloud Storage.
Nice to have: Knowledge of advanced data engineering practices.
Nice to have: Familiarity with observability tools for production infrastructure (e.g., Grafana, Prometheus, OpenTelemetry).
Nice to have: Experience with serverless inference frameworks such as Seldon Core.
Nice to have: Familiarity with Music Information Retrieval.
Machine Learning Engineer developing advanced SLAM systems for autonomous trucking environments at Bot Auto. Collaborating with cross-functional teams to optimize mapping solutions and ensure operational stability.
Graduate Deep Learning Algorithm Developer developing perception technologies for autonomous driving. Tackling challenges in object detection and 3D perception using state-of-the-art deep learning models.
Principal AI/ML Engineer leading the AI/ML infrastructure development for WEX's risk service needs. Focused on innovative engineering and technology solutions within a high-stakes environment.
AI/ML Engineer developing solutions in artificial intelligence for HPE. Responsible for conducting research, designing AI solutions, and mentoring team members.
Machine Learning Engineer focusing on modeling cancer cells and developing related tools. Collaborating with researchers and scientists to advance cancer treatment through ML.
Machine Learning Engineer II developing production-grade ML models for fraud detection at GEICO. Collaborating on system architecture and ensuring optimal performance of fraud assessment systems.
AI/ML Engineer III designing and architecting AI solutions at Hewlett Packard Enterprise. Collaborating with teams to drive innovation and tackle complex problems.
AI/ML Engineer deploying state-of-the-art AI models to solve real-world problems at Brain Co. Working in healthcare, government, and energy sectors for impactful results.
Trainer at WeAndTheMany facilitating learning by sharing experiences and creating interactive sessions. Engaging with students to enhance their skills and knowledge through dynamic teaching methods.
Machine Learning Manager leading an experienced team to drive data-driven AI/ML solutions at Ford. Overseeing strategies for product development focused on analytics in various domains.