AI Trace Generation Engineer designing and implementing trace collection systems for LLM workloads. Analyzing distributed AI workload behavior across multi-GPU and multi-node setups.
Responsibilities
Design and implement a trace collection system for distributed LLM workloads
Validate that collected traces accurately reflect real workload behavior
Integrate with and instrument major LLM frameworks to extract meaningful execution data
Use collected traces as input to discrete event simulations
Analyze trace data to surface bottlenecks and inefficiencies across the stack
Requirements
3+ years of experience in AI systems, ML infrastructure, or a closely related area
Hands-on experience with at least one major LLM serving or training framework
Strong proficiency in Python and C++
Solid understanding of GPU architecture, memory bandwidth, and the difference between compute-bound and memory-bound operations
Solid understanding of distributed communication
Familiarity with parallelism strategies and how they shape execution behavior across large clusters
Open source contributions or published research in relevant areas are a plus
Previous startup experience is a plus
Benefits
Competitive compensation with a performance-based incentive
Subsidized Deutschlandticket
Access to a discount portal
Flexible hours with hybrid and remote-friendly options
Researcher developing AI Literacy resources for Digital Promise Global; collaborating with educators and tech partners to foster equitable implementation.
AI Marketing & Sales Automation intern at ManoMotion working with cutting-edge AI products globally. Supporting automated sales pipelines and creating AI-driven marketing content.
AI Technology Lead at Visma establishing AI-native practices in software development. Collaborating with teams to explore and implement innovative AI technologies for better software delivery performance.
AI Technology Lead role at Visma Labs focusing on innovative AI-driven software development with a multicultural team. Leading hands-on engineering practices for scalable software delivery.
Principal Data Consultant leading advanced analytics and AI solutions for enterprise environments. Guiding clients through strategy, architecture, and AI adoption with a focus on measurable business value.
Sales Specialist focused on Industry HPC and AI solutions for customers at Hewlett Packard Enterprise. Liaising with diverse personas to drive sales and proposals in a hybrid working environment.
Head of Applied AI & Products leading the design and realization of an AI-native product ecosystem at Viridien. Defining technical direction for AI solutions in the geoscience domain.
AI and Automation Intern assisting with AI Agents and automated workflows at Rexford Industrial. Collaborating with teams to improve business processes and efficiency with AI.
AI Operations and Automation Specialist at AnyVan improving logistics efficiency through AI automation. Focused on optimizing customer experience and operational processes in a hybrid work environment.