Senior Machine Learning Engineer developing scalable ML systems for AI applications at Chalice. Designing and deploying production ML models to impact advertising strategies and business outcomes.
Responsibilities
Architect, train, and maintain scalable neural network systems for audience modeling and bid optimization using PyTorch and Ray distributed training (Ray Train, Ray Tune, DDP)
Build and optimize multi-GPU training pipelines on Databricks, including hyperparameter search with ASHA scheduling and early stopping
Develop feature engineering pipelines using PySpark, including embedding layers (EmbeddingBag, Embedding) for categorical and behavioral features
Implement model comparison workflows with champion/challenger evaluation on holdout data
Build resilient training and batch inference workflows with a focus on automation, reproducibility, and checkpoint recovery
Implement robust model monitoring and observability solutions (MLflow, Prometheus, Grafana, Datadog) to track drift, performance metrics (AUC, AUPRC, F1), and system health
Manage model versioning, experiment tracking, and artifact persistence using MLflow and Unity Catalog
Work closely with engineering teams to integrate model outputs into production systems and optimize data flows for fault tolerance
Partner with product stakeholders to align ML efforts with business impact, KPIs, and product strategy across AI Audiences, AI Allocator, CPA Algo, and Curate AI
Lead technical design reviews, contribute to internal Python packages, and enforce engineering best practices (testing, CI/CD, modularity)
Stay current on ML infrastructure advancements (distributed training, inference optimization, model serving patterns) and help guide adoption internally
Document system architectures, create runbooks, and enable team members to adopt and extend the ML framework
Requirements
Master's Degree or PhD in Computer Science, Statistics, Machine Learning, or related discipline with 5-10 years of industry experience
Strong proficiency in PyTorch for neural network development, including custom architectures with embedding layers, MLP backbones, and binary classification heads
Production experience with Databricks including Delta Lake, Unity Catalog, Asset Bundles, and cluster management
Strong grasp of MLOps best practices: experiment tracking (MLflow), model versioning, model serving, monitoring, and reproducibility
Expert-level Python and PySpark skills for data processing and feature engineering at scale
Experience building and maintaining batch inference pipelines with schema versioning and artifact management
Familiarity with cloud platforms (AWS: S3, EC2) and data warehousing (Snowflake)
Experience with CI/CD workflows including build automation, testing, and packaging using GitHub Actions and Make
Excellent collaboration and communication skills; ability to work effectively in a cross-functional environment with Data Science, Product, and Engineering teams
Benefits
Medical, Dental, and Vision coverage
401(k) options
Unlimited PTO
11 Company Holidays
Office-wide closure between Christmas Eve and New Year's
ML Engineering Lead at Saris AI tackling multi-modal AI systems in banking. Drive technical direction and build high-performing teams in an early-stage startup environment.
Machine Learning Engineer designing and training lightweight ASR models for mobile devices at Plaud. Contributing to optimization, multilingual data management, and deployment collaboration.
Machine Learning Engineer designing post-processing test suites for AI interaction systems at Plaud Inc. Collaborating on speech algorithm training and user experience optimization in San Francisco.
Intermediate AI/ML Engineer designing and deploying machine learning solutions for video security. Join Solink to transform video security into real-time operational insights in a hybrid work environment.
Senior Machine Learning Engineer for Toyota Connected developing state-of-the-art solutions for in-vehicle Voice Assistants. Collaborating with teams and mentoring junior members to drive innovation in machine learning technology.
MLOps Engineer leading large-scale model deployments and managing CI/CD pipelines in GCP ecosystem. Focus on operational excellence and implementing observability frameworks for AI systems.
Senior Machine Learning Engineer designing AI systems for multi-scale physical technologies at Orbital. Leading high-risk projects with a focus on AI research and engineering excellence.
Machine Learning Engineer at Auror, using data science to reduce retail crime through innovative ML systems. Collaborate with product teams and develop impactful solutions leveraging real-time data.
Master Thesis focusing on developing machine learning models for lithium-ion cell sorting at Fraunhofer LBF. Involvement in innovative projects addressing circular economy in battery recycling.