On-Device Machine Learning Engineer optimizing and deploying ML models for consumer hardware. Focused on latency, battery life, and user experience for local, on-device processing.
Responsibilities
On-device model optimization and deployment
Convert, optimize, and deploy models to run efficiently on-device using Core ML and/or MLX.
Implement quantization strategies (e.g., 8-bit / 4-bit where applicable), compression, pruning, distillation, and other techniques to meet performance targets.
Profile and improve model execution across compute backends (CPU/GPU/Neural Engine where relevant), and reduce memory footprint.
Local RAG + memory systems
Build and optimize local retrieval pipelines (embeddings, indexing, caching, ranking) that work offline and under tight resource constraints.
Implement local memory systems (short/long-term) with careful attention to privacy, durability, and performance.
Collaborate with product/design to translate “memory” behavior into concrete technical architectures and measurable quality targets.
Model lifecycle on consumer hardware
Own the on-device model lifecycle: packaging, versioning, updates, rollback strategies, on-device A/B testing approaches, telemetry, and quality monitoring.
Build robust evaluation and regression suites that reflect real device constraints and user workflows.
Ensure models degrade gracefully (low-power mode, thermals, backgrounding, OS interruptions).
Performance, reliability, and user experience
Treat battery, thermals, and latency as first-class product requirements: instrument, benchmark, and optimize continuously.
Design inference pipelines and scheduling strategies that respect app responsiveness, animations, and UI smoothness.
Partner with platform engineers to integrate ML into production apps with clean APIs and stable runtime behavior.
Requirements
Strong experience shipping ML features into production, ideally including mobile / edge / consumer devices.
Hands-on proficiency with Core ML and/or MLX, and the practical realities of running models locally.
Solid understanding of quantization and optimization techniques for inference (accuracy/perf tradeoffs, calibration, benchmarking).
Experience building or operating retrieval systems (embedding generation, vector search/indexing, caching strategies)—especially under resource constraints.
Fluency in performance engineering: profiling, latency breakdowns, memory analysis, and tuning on real devices.
Strong software engineering fundamentals: maintainable code, testing, CI, and debugging across complex systems.
Nice to Have:
• Experience with on-device LLMs, multimodal models, or real-time interactive ML features.
• Familiarity with Metal / GPU compute, or performance tuning of ML workloads on Apple platforms.
• Experience designing privacy-preserving personalization and memory (local-first data handling, encryption, retention policies).
• Experience building developer tooling for model packaging, benchmarking, and release management.
• Prior work on offline-first architectures, edge inference, or battery/thermal-aware scheduling.
Benefits
Competitive salary and performance-based incentives.
Comprehensive health, dental, and vision benefits package.
401k Match (US-based only)
$200/month Health and Wellness Stipend
$400/year Continuing Education Credit
$500/year Function Health subscription (US-based only)