Hybrid Senior Systems Engineer, Artificial Intelligence Operations

About the role

  • Senior Systems Engineer at NVIDIA focused on improving AI cluster resiliency and delivering AIOps solutions. The role involves collaborating with team members to debug complex issues and improve customer satisfaction.

Responsibilities

  • Gather and understand internal and external customer requirements to improve AI cluster resiliency, and design AIOps-based solutions that address these needs
  • Develop automated workflows for issue detection and root cause analysis, and collaborate closely with operators to debug sophisticated, full-stack AI cluster problems
  • Deliver compelling technical presentations and lead hands-on demos or training
  • Handle evaluation deployments (POC/POV) and ensure smooth, reliable installations by staying engaged throughout the customer journey

Requirements

  • Bachelor of Science or equivalent experience
  • 8+ years of networking experience in enterprise or service provider environments, with strong hands-on expertise in routing and switching
  • Proficient in scripting and automation using Python or similar languages, with strong Linux expertise
  • Proven experience in Systems Engineer or SRE roles, working directly with customers to resolve issues and ensure their success
  • Exceptional oral, written, and presentation skills for clearly communicating complex technical topics
  • Demonstrated ability to collaborate effectively across teams, partnering with operations, engineering, and product development

Benefits

  • Equity
  • Benefits

Job title

Senior Systems Engineer, Artificial Intelligence Operations

Job type

Experience level

Senior

Salary

$176,000 - $333,500 per year

Degree requirement

Bachelor's Degree

Location requirements
