ML Infrastructure Engineer at ChipStack responsible for building training pipelines for LLMs. Collaborating with chip designers and software engineers in a fast-moving startup environment.
Responsibilities
Build the core infrastructure that enables training, fine-tuning, evaluation, and deployment of LLMs across cloud and on-premise environments
Work alongside highly experienced chip designers, ML scientists, and other top-notch engineers
Contribute to solving some of the hardest problems in chip design
Requirements
5+ years of experience in ML infrastructure or adjacent roles
Deep expertise in Python and experience with training frameworks like PyTorch or TensorFlow
Strong systems engineering skills and experience with distributed training, data pipelines, and performance optimization
Experience deploying ML models to production (REST APIs, batch jobs, streaming pipelines)
Proficiency with cloud platforms (e.g., GCP, AWS) and containerized systems (Docker, Kubernetes)
Experience managing GPU/TPU workloads efficiently
Good communication skills and the ability to work directly with engineers and customers
Prior experience training or fine-tuning LLMs
Experience setting up observability, monitoring, and evaluation pipelines for ML models
Senior Machine Learning Engineer at Itaú, driving innovation with data and AI solutions. Collaborating across teams to implement robust machine learning architectures and ensure scalable deployments.
Machine Learning Engineer responsible for developing and deploying advanced ML and AI solutions at Zendesk. Collaborating with stakeholders to deliver impactful business outcomes using the latest machine learning technologies.
Lead advanced machine learning model development and optimization at PayPal. Collaborate with teams to deploy scalable ML solutions in production environments.
Senior Machine Learning Engineer at Pivotal Health developing ML systems for healthcare reimbursement. Collaborating across teams to build and maintain reliable, production-grade machine learning systems.
Machine Learning Engineer working with the Algorithm team on customer onboarding processes. Focus on execution and automation of models using computer vision and AI in the sports industry.
Senior Machine Learning Engineer at Troveo designing and optimizing machine learning pipelines for AI video models. Collaborating with cross-functional teams to build scalable video data solutions.
Software Engineer focusing on ML infrastructure for drug discovery at Genesis AI. Leading engineering efforts to enhance scalable platforms for generative modeling and large-scale simulations.
AI/ML Engineer developing machine learning systems for TymeX's digital banking platform. Collaborating across teams to enhance customer interaction and personalization through AI technology.