Principal MLOps Engineer leading design and optimization of machine learning infrastructure at Wood Mackenzie. Collaborating with data science and engineering teams to ensure robust automated ML lifecycles.
Responsibilities
Design, build, and maintain highly scalable, robust, and secure machine learning infrastructure and platforms across the entire organization.
Define and drive the long-term MLOps vision, roadmap, and best practices in alignment with broader business and engineering goals.
Establish and optimize automated CI/CD/CT pipelines for machine learning models, ensuring seamless transitions from research to production.
Oversee the deployment of complex models (including LLMs and deep learning models), optimizing for latency, throughput, and cost-efficiency.
Implement enterprise-grade monitoring, alerting, and logging for model performance, data drift, concept drift, and system health.
Ensure robust AI governance and security compliance.
Partner closely with Data Scientists, Data Engineers, Software Engineers, and Product Managers to bridge the gap between model development and software engineering, developing standardized workflows that accelerate the path to production.
Mentor data scientists in MLOps best practices, foster a culture of engineering excellence, and lead technical design reviews.
Requirements
Considerable experience in software engineering, DevOps, or Data Engineering, with dedicated experience in MLOps, ML infrastructure, or deploying ML models at scale.
Deep, hands-on expertise with AWS and its managed ML/AI services (SageMaker, Bedrock).
Advanced proficiency with Kubernetes, Docker, and ML-specific orchestration tools like MLflow.
Strong software development skills in Python, alongside proficiency in languages like C++ or Java for high-performance systems.
Mastery of automation tools (GitHub Actions, GitLab CI, Jenkins, Octopus Deploy) and IaC frameworks (Terraform, Pulumi, Ansible).
Strong understanding of the underlying mechanics of popular ML and deep learning frameworks (PyTorch, TensorFlow, scikit-learn) to effectively troubleshoot and optimize deployments.
Demonstrated ability to lead complex, multi-quarter technical initiatives from conception to successful production rollout, including stakeholder management.
AI/ML Engineer applying AI/ML techniques in hardware manufacturing for yield prediction and process improvement. Collaborating on research and deployment of machine learning models.
Senior Machine Learning Engineer developing advanced ML and NLP solutions for Forrester’s conversational AI chatbot. Collaborating with cross-functional teams to deliver scalable, production-ready ML systems.
Machine Learning Engineer designing GPU computing kernels to optimize 3D GenAI models at Meshy. Collaborating with researchers to enhance performance and efficiency in GPU module development.
Senior Software Engineer developing scalable machine learning solutions for a product-driven team at Maropost. Collaborating on recommendation systems and enhancing developer experience within the Machine Learning team.
AI Engineer with expertise in Machine Learning for Periferia IT Group. Integrating generative AI models and developing solutions in a hybrid work environment.
Senior Platform/MLOps Engineer designing and maintaining scalable infrastructure for AI at Bright Machines. Join a team transforming manufacturing through intelligent automation.
AI/ML Risk Guide enhancing risk management within Capital One's Tech and Product teams. Collaborating on risk solutions that impact customer experience and stability.
Staff Machine Learning Engineer developing content and creator classification systems for Patreon’s platform insights. Collaborating across teams to enhance discovery and recommendations for creators and fans.
Machine Learning Co-op working on sales and collection AI chatbot projects at Lendbuzz. Gaining experience with data annotation, cleaning, and multilingual model evaluation.
AI Prompt Senior Engineer developing and optimizing large language models for TIAA. Collaborating cross-functionally to create innovative AI solutions with a focus on data science.