AI Inference Engineer – Model Optimization, Deployment at Zoox | Hybrid Hired

About the role

Model Optimization & Deployment Engineer optimizing large-scale ML models for Zoox's autonomous vehicle technology. Focused on deploying these models so they execute efficiently in real time on the vehicle.

Responsibilities

Optimize large-scale models (LLMs, VLMs) using advanced quantization (PTQ, QAT), mixed-precision inference workflows, and parameter-efficient fine-tuning (LoRA, QLoRA).
Architect and implement model conversion and compilation pipelines using TensorRT and TensorRT-LLM for edge deployment.
Perform rigorous parity checking, accuracy recovery, and latency benchmarking between the original PyTorch models and the compiled edge binaries.
Write and optimize custom CUDA kernels and TensorRT Plugins to maximize memory bandwidth and minimize latency on AI accelerators.
Write production-level, highly concurrent, and memory-safe C++ and Python code for real-time inference on vehicle SoCs.

Requirements

Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference workflows (INT8, FP8, INT4, BF16/FP16).
Proven experience optimizing large-scale models (LLMs, VLMs, or VLAs) using KV-cache optimization (e.g., PagedAttention), speculative decoding, and efficient attention mechanisms (FlashAttention, linear attention).
Extensive experience with model conversion/compilation pipelines (TensorRT, TensorRT-LLM) and performing rigorous parity/latency benchmarking.
Proficiency in low-level programming for AI accelerators, specifically writing and optimizing custom CUDA kernels and TensorRT Plugins.
Production-level C++ (14/17/20) and Python programming skills, with experience writing concurrent, memory-safe, real-time inference code for edge devices.

Benefits

Paid time off (e.g. sick leave, vacation, bereavement)
Unpaid time off
Zoox Stock Appreciation Rights
Amazon RSUs
Health insurance
Long-term care insurance
Long-term and short-term disability insurance
Life insurance

Similar roles

53 minutes ago

AI-First Technical Associate

Progrits

AI-First Technical Associate at Progrits supporting the CTO and building tech solutions. Engaging in technical problem-solving and collaborating in a dynamic environment.

Hybrid Role

Göteborg Sweden Artificial Intelligence

16 hours ago

Applied AI

Upvest

Applied AI Engineer at Upvest driving AI implementation to enhance business efficiency. Collaborating with teams to integrate AI solutions and optimize workflows across company operations.

Hybrid Role

Berlin Germany Artificial Intelligence

yesterday

AI Analyst – Product

Personio

AI Analyst contributing to cross-functional product teams at Personio, leveraging data insights and analytics for decision-making. Responsible for event tracking, impact measurement, and user behavior analysis.

Hybrid Role

Berlin Germany Artificial Intelligence

yesterday

AI Safety Fellow

Anthropic

AI Fellows Program at Anthropic fostering research and engineering talent. Engaging in empirical AI projects with mentorship and resources.

Hybrid Role

San Francisco United States Artificial Intelligence

$3,850 per week

yesterday

Senior Principal AI/ML Cheminformatics Scientist

Pfizer

AI Cheminformatics Scientist at Pfizer applying AI methods to drug discovery. Collaborating with medicinal chemists and biologists while developing cheminformatics workflows.

Hybrid Role

Groton United States Artificial Intelligence

$124,400 - $207,400 per year

yesterday

AI & Digitalization Engineer — 20 hours/week

ANDRITZ

AI Engineer developing and deploying AI solutions for digital transformation at ANDRITZ. Focus on machine learning and modern AI technologies to create impactful systems.

Onsite Role

Graz Austria Artificial Intelligence

€3,396 per month

yesterday

Business Strategy & Initiatives Manager – AI Driven Document Intelligence

Bank of America

Strategy Manager overseeing AI-driven document intelligence initiatives at Bank of America. Driving operational improvements through strategic partnerships and innovative document processing solutions.

Hybrid Role

New York City United States Artificial Intelligence

$115,000 - $158,000 per year

yesterday

Junior AI Automation Manager

morefire

Junior AI Automation Manager developing AI workflows and automations at morefire GmbH. Collaborating on innovative projects with a focus on performance and creativity.

Hybrid Role

Köln Germany Artificial Intelligence

yesterday

Junior AI Specialist

Fortinet

Junior AI Cybersecurity Specialist for FortiGuard IoC team using AI for threat detection. Designing ML models and developing AI solutions to combat cybersecurity threats.

Hybrid Role

Burnaby Canada Artificial Intelligence

CA$79,000 - CA$97,000 per year

2 days ago

AI Agent Engineer

Workday

AI Engineer developing and operating AI products for Iyuno's projects with local teams, managing issues, and improving efficiency.

Hybrid Role

Seoul South Korea Artificial Intelligence