AI Inference Engineer developing AI model optimizations for Quadric's GPNPU platforms. Porting and benchmarking AI models to enhance performance on edge devices.
Responsibilities
Quantize, prune and convert models for deployment
Port models to Quadric platform using Quadric toolchain
Optimize inference deployment for latency and throughput
Benchmark and profile model performance and accuracy
Develop tools to scale and speed up deployment
Improve the SDK and runtime
Provide technical support and documentation to customers and the developer community
Requirements
Bachelor’s or Master’s in Computer Science and/or Electrical Engineering
5+ years of experience in AI/LLM model inference and deployment frameworks/tools
Experience with model quantization (PTQ, QAT) and related tools
Experience with model accuracy metrics
Experience with model inference performance profiling
Experience with at least one of the following frameworks: ONNX Runtime, PyTorch, vLLM, Hugging Face Transformers, Neural Compressor, llama.cpp
Proficiency in C/C++ and Python
Strong problem-solving, debugging, and communication skills
AI Prompt Engineer focusing on developing conversational AI experiences for healthcare professionals at Elsevier. Join a team creating innovative solutions powered by generative AI.
Junior AI Videographer creating engaging AI-driven video and visual content for a multi-asset broker. Collaborating on marketing campaigns and digital storytelling.
Technology Consultant role with Avanade focusing on IT and digital solutions after completing a foundational training program. Join a community passionate about technology and innovation.
Manager in Data & AI for Defense at Atos, responsible for structuring AI consulting practice. Leading projects related to AI sovereignty and resilience for defense and aerospace sectors.
Applied Researcher leveraging AI technologies to enhance customer interactions at Capital One. Collaborating with experts to build, evaluate, and implement advanced AI models across financial services.
Applied Researcher I at Capital One driving AI innovations for banking. Collaborating with cross-functional teams to develop AI-powered products and enhance customer experiences.
Applied Researcher I utilizing AI foundations to enhance customer banking experiences at Capital One. Collaborating with cross-functional teams to build and implement innovative AI-powered solutions for improved interactions.
Industrial AI Applications Lead driving the development and deployment of AI solutions in industrial environments. Collaborating with customers and teams for practical, scalable, and impactful results.