Senior Software Engineer focusing on the performance of LLMs and AI workloads on distributed infrastructure. Working with cutting-edge technologies and improving efficiency for enterprise applications.
Responsibilities
Add support for new LLMs, working across the stack from low-level GPU kernels to Kubernetes-based deployments.
Contribute to cutting-edge open-source LLM engines such as vLLM or SGLang to extend their capabilities and performance (e.g. use Python technologies to improve API servers or request schedulers).
Operate closer to the hardware, focusing on building and integrating solutions to boost performance and hardware utilization. For example, improve attention backends like FlashAttention or FlashInfer by contributing to their development and optimization, or by integrating their solutions into vLLM.
Improve LLM performance using advanced algorithmic solutions such as speculative decoding, quantization, or other state-of-the-art techniques. Understand the impact of such techniques on model quality.
Requirements
Expertise in GPU computing, including low-level platforms and frameworks such as CUDA, ROCm, XLA, PyTorch, and JAX.
Background in performance analysis and optimization of AI/HPC workloads (e.g. profiling or theoretical analysis of FLOPs and bandwidth).
Experience writing GPU kernels with technologies such as CUDA, CUTLASS, or Triton.
Strength in Python and C++.
Demonstrated contributions to open-source projects. Contributions to inference engines such as vLLM are a strong plus.
A production-oriented mindset emphasizing robust, scalable code suitable for enterprise-grade applications.
A relentless curiosity about cutting-edge AI technologies combined with a passion for solving complex problems.
Full Stack Developer integrating AI into web solutions at Hypernova Labs, enhancing tech innovation. Collaborating with multidisciplinary teams in scalable software development in Panama.
Senior Software Developer developing and maintaining RokDoc applications for Ikon Science. Collaborating with geoscience experts and agile teams across multiple locations.
IT Assistant providing technical support and developing software solutions for Marchi & Fildi. Seeking a collaborative individual with a passion for technology and user support.
Fullstack Developer at LeadTable working with modern tech stack to develop features for SaaS application. Collaborating closely with agile teams and ensuring high code quality.
Managing Director and Lead Product Engineer for EMS platform development in a new joint venture. Focused on software enhancement and strategic direction for energy management systems.
Senior Engineer II developing and modernizing internal tools for Strava's platform. Collaborating with teams to enhance developer efficiency and support systems.
Lead Software Engineer at HyperFi overseeing platform architecture and guiding engineering teams. Involves working closely with the CTO and shaping technical roadmaps.
Senior Software Engineer developing AI solutions for Exacaster utilizing Large Language Models. Building production-ready applications and collaborating with cross-functional teams in a hybrid work environment.
Full Stack Developer at Nuvei building high-performance web applications for payment processing solutions. Collaborating with cross-functional teams to implement advanced AI features and backend APIs.
Full Stack Developer developing scalable web applications for logistics platform Wedrop with a focus on organization and best practices. Autonomy in technical demands.