Customer Engineer managing post-sales technical relationships and maintaining AI workloads for enterprise customers. Collaborates with internal teams to resolve issues and improve product performance.
Responsibilities
Serve as the first responder to all post-sales customer issues via ticketing (Pylon) and Slack, triaging and resolving Tier 1 and Tier 2 issues independently.
Diagnose runtime issues related to latency, memory behavior, GPU utilization, concurrency, and model lifecycle management.
Debug infrastructure problems across Kubernetes (pods, controllers), networking, observability, and alerting systems.
Pull logs, read error traces, and correlate signals across Grafana, Loki, and Prometheus to pinpoint root causes — even when the real issue is buried layers deep.
Lead incident response during outages and escalations, coordinating across Product, SRE, Sales, and Engineering.
Own customer communication through resolution — even when the fix is handed off to SRE or Infra — including delivering root-cause analyses after every P0/P1.
Escalate to SRE or other engineering teams with structured context (customer, affected models, what you've already ruled out, specific ask) so nothing gets lost in translation.
Drive post-incident alerting reviews: why did the customer find this before we did, and what instrumentation or process change prevents it next time?
Serve as the technical owner for top enterprise accounts with strict SLAs and high responsiveness expectations.
Set up and maintain proactive monitoring and alerts for all customer production models within 24 hours of handoff from the Solution Architect (SA).
Drive the QBR process and proactive reengagement for expansion opportunities.
Track recurring failure patterns across accounts and push for durable fixes — not just incident closure.
Monitor internal feedback channels and route product-level issues to the right teams.
Own the SA-to-CE handoff for new customers: validate architecture, confirm production-readiness milestones, and establish escalation paths.
Maintain and improve runbooks, knowledge bases, and diagnostic best practices so the team scales with the customer base.
Translate user feedback into roadmap signals, documentation improvements, and product enhancements.
Coordinate end-to-end on projects spanning feature requests, new deployments, and operational debugging — scoping, execution, communication, and stakeholder alignment.
Requirements
Deep Kubernetes troubleshooting expertise, including resource debugging, pod/runtime analysis, and log-based diagnostics with observability tooling (Grafana, Loki, Prometheus).
Strong infrastructure debugging across container orchestration, networking, and service dependencies, with hands-on production cluster experience.
Experience managing high-severity incidents with major customers — SLAs, war rooms, post-incident reviews, and clear executive-level communication throughout.
Proven project management skills with an ownership mindset: you can run multiple complex, multi-stakeholder initiatives in parallel without dropping threads.
Ability to translate recurring technical pain points into roadmap-level insights and product improvements.
Strong communication skills and executive presence during high-visibility situations, ensuring both technical clarity and customer confidence.
3+ years of experience in a fast-paced, high-growth, or customer-facing engineering environment.
Benefits
Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents.
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
Paid parental leave.
Company-facilitated 401(k).
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.