Responsibilities
Analyze the performance of LLMs on NVIDIA GPUs using advanced profiling and projection tools.
Identify opportunities for performance improvement in the IR-based compiler middle-end optimizer and/or in precompiled kernel optimizations driven by Graph IR transformations.
Design and develop new compiler passes and optimization techniques to deliver robust, maintainable compiler infrastructure and tools.
Collaborate closely with architecture teams to influence and co-design future hardware features that improve compiler and runtime efficiency.
Work with geographically distributed teams across compiler, hardware, kernel, and framework domains to drive performance improvements and resolve complex issues.
Contribute to a core team at the forefront of deep learning and LLM inference technology, spanning hardware architecture development, kernel optimization, and integration with higher-level deep learning frameworks.
Requirements
Master’s or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.
5+ years of relevant experience.
Strong hands-on programming expertise in C++ and Python, with solid software engineering fundamentals.
Experience with state-of-the-art LLM architectures, including inference optimization, profiling, and compiler-level performance tuning.
Significant background in IR-driven kernel optimization and code generation, including graph transformations, fusion, scheduling, and custom kernel generation frameworks such as OpenAI Triton or other compiler-based code generation pipelines.
Hands-on experience with LLM inference and deep learning frameworks such as TensorRT-LLM, vLLM, SGLang, JAX/XLA, or related compiler/runtime environments.
Proven ability to analyze and optimize LLM performance bottlenecks across model development, kernel execution, and runtime systems.
Excellent communication and collaboration skills, with the ability to work independently and effectively across distributed teams in a fast-paced environment.
Strong drive to continuously improve software and hardware performance through profiling, analysis, and optimization.
Proficiency in CUDA programming and familiarity with GPU-accelerated deep learning frameworks and performance tuning techniques.
Benefits
Equity
Job title
Senior AI Software Engineer, LLM Inference Performance Analysis