Senior Backend Engineer, Data Mining (Hybrid)

Posted 2 months ago

About the role

  • As a Senior Backend Engineer at Motional, you will architect high-throughput backend systems for AI-driven data mining of autonomous vehicle data. You will collaborate closely with ML engineers to optimize data processing and ensure reliability.

Responsibilities

  • Architect the OmniTag Engine: Design and build the high-throughput, low-latency backend systems that execute billion-scale inference across Ray/Spark, transforming raw sensor data into unified multimodal representations.
  • Scale Multimodal Data Pipelines: Own the complete data journey - from ingestion, normalization, and preprocessing of heterogeneous modalities (image, video, LiDAR, audio) through encoding, indexing, and cached embedding storage. Ensure pipelines are robust, observable, and meet the SLOs expected by downstream ML teams.
  • Evolve the Vector Search and Retrieval Engine: Enhance our in-house billion-scale vector search engine to power RAG-driven few-shot dataset creation. Optimize embedding storage, retrieval performance, and filtering across billions of examples to enable rapid interactive mining workflows.
  • Own Data Quality and Observability: Build comprehensive monitoring, logging, and alerting for multimodal data preprocessing pipelines. Develop data validation frameworks that catch regressions in data alignment, normalization, or encoding quality, all of which are critical for maintaining model performance.
  • Collaborate on Encoder-Decoder Adaptation: Work closely with ML engineers to support domain-specific fine-tuning workflows, model versioning, and A/B testing of new encoders and decoders. Ensure the backend infrastructure enables rapid experimentation with emerging open-source multimodal foundation models.
  • Drive Production Reliability: Establish patterns for graceful degradation, fault tolerance, and cost optimization. Operate OmniTag as a mission-critical data platform serving the entire ML organization, with a focus on reliability, debuggability, and operational excellence.

Requirements

  • BS in Computer Science or a related field, or equivalent professional experience
  • 6+ years designing, building, and operating large-scale distributed systems in production environments
  • Deep, hands-on expertise with Ray or Spark (or both) for distributed data processing and large-scale inference workloads
  • Expert-level Python proficiency with strong software engineering fundamentals: testing (unit, integration, and end-to-end), CI/CD pipelines, containerization, and code review practices
  • Proven experience optimizing and scaling production data pipelines that process terabytes or petabytes of data
  • Strong SQL and data manipulation skills; comfort with both structured and semi-structured data
  • Experience with cloud infrastructure (AWS preferred: S3, EC2, EKS, EMR, IAM) and infrastructure-as-code patterns
  • Demonstrated track record of shipping robust, well-tested, production-grade systems and mentoring junior engineers

Benefits

  • medical
  • dental
  • vision
  • 401(k) with a company match
  • health savings accounts
  • life insurance
  • pet insurance
  • additional forms of compensation such as a bonus or company equity

Job title

Senior Backend Engineer, Data Mining

Job type

Experience level

Senior

Salary

$159,000 - $207,000 per year

Degree requirement

Bachelor's Degree

Location requirements
