MLOps Engineer at a fast-growing Quantum Software company deploying ML/LLM models for Fortune Global 500 clients. Collaborating with experts to design, develop, and implement large-scale ML solutions.
Responsibilities
Deploy cutting-edge ML/LLM models to Fortune Global 500 clients.
Join a world-class team of Quantum experts with an extensive track record in both academia and industry.
Collaborate with the founding team in a fast-paced startup environment.
Design, develop, and implement Machine Learning (ML) and Large Language Model (LLM) pipelines, encompassing data acquisition, preprocessing, model training and tuning, deployment, and monitoring.
Employ automation practices and tools such as GitOps, CI/CD pipelines, and containerization and orchestration technologies (Docker, Kubernetes) to streamline ML/LLM processes throughout the model lifecycle.
Establish and maintain comprehensive monitoring and alerting systems to track Large Language Model performance, detect data drift, and monitor key metrics, proactively addressing any issues.
Conduct ground-truth analysis to evaluate the accuracy and effectiveness of Large Language Model outputs against known, accurate data.
Collaborate closely with Product and DevOps teams and Generative AI researchers to optimize model performance and resource utilization.
Manage and maintain cloud infrastructure (e.g., AWS, Azure) for Large Language Model workloads, ensuring both cost-efficiency and scalability.
Stay updated with the latest developments in ML/LLM Ops, integrating these advancements into generative AI platforms and processes.
Communicate effectively with both technical and non-technical stakeholders, providing updates on Large Language Model performance and status.
Requirements
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Mid or Senior: 4+ years of experience as an ML/LLM engineer on public cloud platforms.
Proven experience in MLOps, LLMOps, or related roles, with hands-on experience in managing machine/deep learning and large language model pipelines from development to deployment and monitoring.
Expertise in cloud platforms (e.g., AWS, Azure) for ML workloads, MLOps, DevOps, or Data Engineering.
Expertise in model parallelism for training and serving, as well as data parallelism and hyperparameter tuning.
Proficiency in programming languages such as Python, distributed computing tools such as Ray, and model parallelism frameworks such as DeepSpeed, Fully Sharded Data Parallel (FSDP), or Megatron-LM.
Expertise in generative AI applications and domains, including content creation, data augmentation, and style transfer.
Strong understanding of Generative AI architectures and methods, such as chunking, vectorization, and context-based retrieval and search, and experience working with Large Language Models such as OpenAI GPT-3.5/GPT-4, Llama 2, Llama 3, and Mistral.