About the role

Deploy and monitor ML/LLM pipelines for Fortune Global 500 clients. Work on generative AI and model parallelism at a leading EU quantum software company.

Responsibilities

Deploy cutting-edge ML/LLM models to Fortune Global 500 clients
Design, develop, and implement ML and LLM pipelines including data acquisition, preprocessing, model training, tuning, deployment, and monitoring
Employ automation tools and practices such as GitOps, CI/CD pipelines, Docker, and Kubernetes to streamline the ML/LLM lifecycle
Establish and maintain monitoring and alerting systems to track LLM performance and detect data drift
Conduct ground-truth analysis to evaluate LLM outputs against verified reference data
Collaborate with Product and DevOps teams and with Generative AI researchers to optimize model performance and resource utilization
Manage and maintain cloud infrastructure (AWS, Azure) for LLM workloads ensuring cost-efficiency and scalability
Stay up to date with MLOps/LLMOps developments and integrate advancements into generative AI platforms
Communicate LLM performance and status to technical and non-technical stakeholders

Requirements

Bachelor's or Master's degree in Computer Science, Engineering, or a related field
Mid or Senior level: 4+ years of experience as an ML/LLM engineer on public cloud platforms
Proven experience in MLOps, LLMOps, or related roles managing ML/LLM pipelines from development to deployment and monitoring
Expertise in cloud platforms (e.g., AWS, Azure) for ML workloads, MLOps, DevOps, or Data Engineering
Expertise in model parallelism for training and serving, as well as data parallelism and hyperparameter tuning
Proficiency in Python
Experience with distributed computing tools such as Ray
Experience with model parallelism frameworks such as DeepSpeed, Fully Sharded Data Parallel (FSDP), or Megatron-LM
Expertise with generative AI applications (content creation, data augmentation, style transfer)
Strong understanding of Generative AI architectures and methods (chunking, vectorization, context-based retrieval/search)
Experience working with Large Language Models such as OpenAI GPT-3.5/GPT-4, Llama 2, Llama 3, and Mistral
Experience with Azure Machine Learning, Azure Kubernetes Service, Azure CycleCloud, Azure Managed Lustre
Excellent English; Spanish is a plus
Strong communication skills and the ability to work collaboratively in an international environment
Preferred: experience training Mixture-of-Experts models, working with multiple public clouds or hybrid environments, real-time streaming, training/inference optimization, LLM observability, and API management