Senior Performance and Development Engineer at NVIDIA focusing on optimizing AI workloads and developing scalable AI infrastructure tools. Collaborating with a diverse team to enhance Deep Learning applications.
Responsibilities
Build AI models, tools, and frameworks that provide real-time application performance metrics that can be correlated with system metrics.
Develop automation frameworks that enable applications to predict and overcome system and infrastructure failures, ensuring fault tolerance.
Collaborate with software teams to pinpoint performance bottlenecks.
Design, prototype, and integrate solutions that deliver demonstrable performance gains in production environments.
Adapt and enhance communication libraries to seamlessly support innovative network topologies and system architectures.
Design or adapt optimized storage solutions to boost Deep Learning efficiency, resilience, and developer productivity.
Requirements
BS/MS/PhD (or equivalent experience) in Computer Science, Electrical Engineering or a related field.
12+ years of proven experience in analyzing and improving the performance of training applications using PyTorch or a similar framework.
Experience building distributed software applications using collective communication libraries such as MPI, NCCL, or UCC.
Experience constructing storage solutions for Deep Learning applications.
Experience building automated, fault-tolerant distributed applications.
Experience building tools for bottleneck analysis and automation of fault tolerance in distributed environments.
Strong background in parallel programming and distributed systems.
Experience analyzing and optimizing large-scale distributed applications.
Excellent verbal and written communication skills.