Hybrid Software Engineer – Model Performance

Posted 2 hours ago

About the role

  • As a Software Engineer focused on ML performance at Baseten, you will drive optimizations for large language models and join a team contributing to advanced AI applications.

Responsibilities

  • Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
  • Dive deep into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues.
  • Apply and scale optimization techniques across a wide range of ML models, particularly large language models.
  • Collaborate with a diverse team to design and implement innovative solutions.
  • Own projects from idea to production.

Requirements

  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field.
  • Experience with one or more general-purpose programming languages, such as Python or C++.
  • Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
  • Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
  • Demonstrated interest in and experience with LLMs.
  • Deep understanding of GPU architecture.
  • Proficiency in enhancing the performance of software systems, particularly in the context of large language models (LLMs) (Bonus).
  • Experience with CUDA or similar technologies (Bonus).
  • Deep understanding of software engineering principles and a proven track record of developing and deploying AI/ML inference solutions (Bonus).
  • Experience with Docker and Kubernetes (Bonus).

Benefits

  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents.
  • Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
  • Paid parental leave.
  • Company-facilitated 401(k).
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Job title

Software Engineer – Model Performance

Job type

Experience level

Mid level, Senior

Salary

$180,000 - $360,000 per year

Degree requirement

Bachelor's Degree

Location requirements
