Software Engineer focused on Model APIs, ensuring performance for AI models at Baseten. Collaborating with product and infrastructure teams to streamline developer interaction with AI models.
Responsibilities
Design, build, and operate the Model APIs surface with a focus on advanced inference capabilities: structured outputs (JSON mode, grammar-constrained generation), tool/function calling, and multi-modal serving.
Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, implement custom CUDA operators, tune memory allocation patterns for maximum throughput, and optimize communication patterns across multi-GPU setups.
Productionize performance improvements across runtimes with a deep understanding of their internals: speculative decoding implementations, guided generation for structured outputs, and custom scheduling and routing algorithms for high-performance serving.
Build comprehensive benchmarking frameworks that measure real-world performance across different model architectures, batch sizes, sequence lengths, and hardware configurations.
Productionize performance improvements across runtimes (e.g., TensorRT, TensorRT‑LLM): speculative decoding, quantization, batching, and KV‑cache reuse.
Instrument deep observability (metrics, traces, logs) and build repeatable benchmarks to measure speed, reliability, and quality.
Implement platform fundamentals: API versioning, validation, usage metering, quotas, and authentication.
Collaborate closely with other teams to deliver robust, developer‑friendly model serving experiences.
Requirements
3+ years of experience building and operating distributed systems or large‑scale APIs.
Proven track record of owning low‑latency, reliable backend services (rate‑limiting, auth, quotas, metering, migrations).
Infra instincts with performance sensibilities: profiling, tracing, capacity planning, and SLO management.
Comfortable debugging complex systems, from runtime internals to GPU execution traces.
Strong written communication; able to produce clear design docs and collaborate across functions.
Benefits
Competitive compensation package.
This is a unique opportunity to be part of a rapidly growing startup in one of the most exciting engineering fields of our era.
An inclusive and supportive work culture that fosters learning and growth.
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.