Hybrid Machine Learning Operations Engineer II

Posted 16 hours ago

About the role

  • As an MLOps Engineer at Kensho, you will build ML processes and robust workflows, collaborating with engineers to enhance tooling, services, and frameworks for machine learning.

Responsibilities

  • Iterate on Kensho’s ML processes to develop tools, services, and frameworks that make every stage of the ML workflow robust, auditable, and usable.
  • Work closely with ML engineers to understand their unique processes, identify pain points, and form effective solutions.
  • Empower engineers with the stable tooling necessary to rapidly experiment and actualize their research into demonstrable prototypes and mature products
  • Provide resources and training for ML teams on best practices, enabling them to efficiently productionize their work to be leveraged by high-value products and services
  • Evaluate, select, and champion open source and third-party solutions, driving their adoption across teams and integrating them into Kensho’s existing platform ecosystem
  • Ship scalable, efficient, and automated processes for model fine-tuning and reinforcement learning and for the evaluation of LLMs/Agents
  • Improve LLM and agentic observability to monitor agentic applications in production and detect performance, decay, and drift issues
  • Stay at the frontier by actively tracking emerging tools and frameworks, promoting best practices, and strengthening the technical expertise of the team with your unique skill set

Requirements

  • 2+ years of experience in ML infrastructure, MLOps, ML engineering, or a similar field
  • Experience managing distributed systems with Kubernetes.
  • Understanding of cloud platforms (AWS). We use tools like EKS and managed ML services like Bedrock and SageMaker
  • Python proficiency (we are mostly a Python shop)
  • Familiarity with distributed computing frameworks and workflow orchestration (e.g., Ray, Airflow)
  • Familiarity with software engineering best practices in an ML context
  • Some basic understanding of ML concepts, LLMs and agents
  • Ability to debug distributed systems across infrastructure, networking and application layers
  • Excellent communication skills to drive adoption of new tools and best practices across multiple teams
  • Someone who’s very curious, driven, low-ego and eager to learn across a range of engineering disciplines, while being part of a fantastic team.

Benefits

  • Medical, dental, and vision insurance with 100% company-paid premiums
  • Unlimited Paid Time Off
  • 26 weeks of 100% paid Parental Leave (paternity and maternity)
  • 401(k) plan with 6% employer matching
  • Generous company matching on donations to non-profit charities
  • Up to $20,000 tuition assistance toward degree programs, plus up to $4,000/year for ongoing professional education such as industry conferences
  • Plentiful snacks, drinks, and regularly catered lunches
  • Dog-friendly office (CAM office)
  • Bike sharing program memberships
  • Compassion leave and elder care leave
  • Mentoring and additional learning opportunities
  • Opportunity to expand professional network and participate in conferences and events

Job title

Machine Learning Operations Engineer II

Experience level

Junior, Mid level

Salary

$130,000 - $175,000 per year

Degree requirement

Bachelor's Degree
