ML Infrastructure Engineer at ChipStack responsible for building training pipelines for LLMs. Collaborating with chip designers and software engineers in a fast-moving startup environment.
Responsibilities
Build the core infrastructure that enables training, fine-tuning, evaluation, and deployment of LLMs across cloud and on-premise environments
Work alongside highly experienced chip designers, ML scientists, and other top-notch engineers
Contribute to solving some of the hardest problems in chip design
Requirements
5+ years of experience in ML infrastructure or adjacent roles
Deep expertise in Python and experience with training frameworks like PyTorch or TensorFlow
Strong systems engineering skills and experience with distributed training, data pipelines, and performance optimization
Experience deploying ML models to production (REST APIs, batch jobs, streaming pipelines)
Proficiency with cloud platforms (e.g., GCP, AWS) and containerized systems (Docker, Kubernetes)
Experience managing GPU/TPU workloads efficiently
Good communication skills and the ability to work directly with engineers and customers
Prior experience training or fine-tuning LLMs
Experience setting up observability, monitoring, and evaluation pipelines for ML models
Machine Learning Engineer developing advanced ML-driven applications to enhance quantum technologies. Collaborating with teams to translate complex physical data into actionable improvements.
Lead Machine Learning Engineer at Disney applying AI and machine learning to enhance advertising capabilities. Collaborating with teams to build robust ML systems and drive innovation.
Senior Machine Learning Scientist improving customer and business outcomes using ML and statistical modeling. Working with an experienced team on end-to-end model development.
Senior AI/ML Ops Engineer at Smartsheet responsible for building scalable AI/ML platforms. Collaborating with cross-functional teams to enhance data infrastructure and operational efficiency.
Machine Learning Engineer developing LLM-powered systems at Trainline. Designing predictive ML systems and collaborating with cross-functional teams on AI initiatives.
Staff ML Engineer building scalable platforms for ML model training and evaluation at GM. Collaborating on autonomous driving technology development and mentoring junior engineers.
Machine Learning Software Engineer developing and industrialising AI solutions for Tech Soft 3D's HOOPS AI product. Collaborating on core libraries and APIs for industrial 3D applications.
AI and ML Engineer deploying machine learning solutions for national security. Collaborating with engineers and scientists to deliver data processing solutions at scale.