Senior Software Engineer focusing on performance of LLMs and AI workloads on distributed infrastructure. Working with cutting-edge technologies and improving efficiency for enterprise applications.
Responsibilities
Add support for new LLMs, working across the stack from low-level GPU kernels to Kubernetes-based deployments.
Contribute to cutting-edge open-source LLM engines such as vLLM or SGLang to extend their capabilities and performance (e.g. use Python technologies to improve API servers or request schedulers).
Operate closer to the hardware, focusing on building and integrating solutions to boost performance and hardware utilization. For example, improve attention backends like FlashAttention or FlashInfer by contributing to their development and optimization, or by integrating their solutions into vLLM.
Improve LLM performance using advanced algorithmic solutions such as speculative decoding, quantization, or other state-of-the-art techniques. Understand the impact of such techniques on model quality.
Requirements
Expertise in GPU computing, including platforms and frameworks such as CUDA, ROCm, XLA, PyTorch, and JAX.
Background in performance analysis and optimization of AI/HPC workloads (e.g. profiling or theoretical analysis of FLOPs and memory bandwidth).
Experience writing GPU kernels using technologies such as CUDA, CUTLASS, or Triton.
Strength in Python and C++.
Demonstrated contributions to open-source projects. Contributions to inference engines such as vLLM are a strong plus.
A production-oriented mindset emphasizing robust, scalable code suitable for enterprise-grade applications.
A relentless curiosity about cutting-edge AI technologies combined with a passion for solving complex problems.