Hybrid Lead Software Engineer – AI Operations and Tooling

Posted 4 weeks ago

About the role

  • Lead the AI Operations and Tooling effort for Disney's media technology, ensuring safe and efficient AI applications across major cloud platforms while mentoring teams.

Responsibilities

  • Define frameworks for AI-specific operations: hallucination/quality testing, evaluation pipelines, and continuous validation.
  • Establish reference patterns for scaling LLM services, prompt orchestration, and multi-agent workloads.
  • Build automation for safe rollout, monitoring, and incident response.
  • Implement end-to-end observability: latency, drift, failure modes, hallucination rates, and GPU/compute utilization.
  • Drive cost optimization and efficiency across AI cloud usage (AWS, Azure, GCP).
  • Define SLOs, dashboards, and runbooks for AI/LLM production systems.
  • Embed compliance, safety checks, and prompt-injection defenses into operational frameworks.
  • Mentor engineers in DevOps, infrastructure, and AI operations.
  • Drive adoption of best practices for AI reliability, test automation, and incident management.
  • Collaborate across AI Core, Data Foundations, Security, and Product teams to ensure operational safety and scale.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related technical field (Master’s preferred), or equivalent experience.
  • 7+ years of experience in software engineering, DevOps, or infrastructure, with at least 2 years in a lead role.
  • Expert in at least one foundational language (Python, Java, or Go) with production-grade system experience.
  • Hands-on experience with cloud-native infrastructure (AWS preferred; Azure/GCP a plus) and modern orchestration platforms.
  • Proven experience with observability stacks (Datadog, Prometheus, Grafana) and incident response automation.
  • Familiarity with AI/LLM APIs (OpenAI, Anthropic, Bedrock, Azure AI Foundry) and orchestration frameworks (LangChain, LangGraph).
  • Strong knowledge of operational AI testing (A/B evaluation, regression, red-teaming) and guardrail enforcement.
  • Demonstrated ability to optimize cloud/GPU usage and manage costs at scale.
  • Excellent communication skills and proven ability to lead design reviews, mentor engineers, and influence cross-functional teams.

Benefits

  • Health insurance
  • 401(k) matching
  • Flexible work hours
  • Paid time off
  • Remote work options
  • Bonuses

Job title

Lead Software Engineer – AI Operations and Tooling

Job type

Experience level

Senior

Salary

$141,900 - $190,300 per year

Degree requirement

Bachelor's Degree

Location requirements
