ML Inference Router Engineer designing scalable inference systems at eBay. Aiming to support billions of daily requests with a focus on reliability and efficiency.
Responsibilities
Design and build an LLM inference gateway that scales to billions of daily requests with millisecond-level latency.
Develop intelligent request routing, load balancing, and fallback mechanisms across heterogeneous LLM backends (internal and external).
Optimize throughput, cost, and reliability of inference workloads in multi-tenant environments.
Collaborate with platform, research, and product teams to integrate new models and agentic capabilities into the gateway.
Implement observability, tracing, and autoscaling for inference traffic across Kubernetes-based clusters.
Conduct design and code reviews to ensure high standards in distributed systems architecture.
Stay current with advances in LLM serving, inference acceleration, and model APIs to continuously evolve the platform.
Requirements
10+ years of experience building large-scale, fault-tolerant, high-performance distributed systems.
Strong programming skills in one or more of Java, Go, Rust, or C++ (Java preferred for gateway services).
Deep understanding of networking, concurrency, memory management, and performance tuning in production systems.
Proven experience designing and operating low-latency APIs at very large scale (10M+ QPS).
Hands-on experience with Kubernetes, service meshes, and container orchestration at scale.
Strong background in cloud infrastructure (AWS, GCP, Azure) and distributed system design.
Benefits
Full range of medical benefits
Financial benefits
Various paid time off benefits, such as PTO and parental leave
High Precision GNSS Algorithm Engineer designing and validating embedded positioning software at Septentrio. Collaborating with experts to develop a next-generation Positioning Engine for centimeter-level GNSS solutions.
Facilities and Environmental Health & Safety Engineer at Rubbercraft responsible for compliance with EHS regulations. Manage stormwater, wastewater reports, and coordinate safety programs.
Entry-level engineer joining Hazen and Sawyer to work on water treatment and management projects. Collaborating on engineering deliverables and participating in fieldwork and site visits.
Building Chief Engineer overseeing operations of a large school facility. Leading a team and managing mechanical engineering, HVAC systems, plumbing, and electrical systems compliance.
Site Engineer collaborating with the purchasing team at Ford to ensure quality and timely service. Building strong supplier relationships and optimizing costs in vehicle production.
Durability Engineer working with Ford on vehicle durability attributes from concept to production. Championing cross-functional teams to ensure vehicle robustness and reliability during testing.
Junior Engineer creating GCP data-driven solutions for enterprise data architecture. Collaborating with stakeholders on data needs and optimizing performance and reliability.
Gas Turbine Project Engineer in a team-oriented environment focusing on hardware & software design, implementation, testing, and customer support for gas turbine control systems.
Certification Engineer leading agency certifications and qualification testing for hazardous-location electrical products. Collaborating with agencies and engineering teams to ensure compliance and safety standards.