Perception Data Engineer at ANYbotics building data pipelines for perception models in mobile robotics. Collaborating within a global team on cutting-edge robotic technology.
Responsibilities
Build and operate the data plumbing that our perception models need: ingestion, versioned storage, ETL, labeling integration, and reliable production pipelines for training and inference.
Design, build and maintain scalable data pipelines and ETL workflows that ingest raw images, sensor metadata, and labels (both real and synthetic).
Implement dataset versioning, schema management, and reproducible data snapshots to support experiments and audits.
Integrate annotation tools (CVAT / Label Studio), manage labeling workflows and quality-control tooling, and support label QA processes.
Build data validation and monitoring checks (file integrity, label sanity, distribution drift alerts) and automate remediation where possible.
Provide clean, ready-to-use datasets and data loaders for ML engineers; optimize data access patterns for training (sharding, caching, prefetching).
Requirements
3+ years of engineering experience building production data pipelines or ETL systems.
Strong Python scripting and engineering skills (pandas, pyarrow, boto3 or equivalent).
Experience with dataset versioning or large-file management (DVC, Git-LFS, or similar) and cloud object storage (S3).
Familiarity with annotation tooling and workflows for image data (CVAT / Label Studio).
Basic understanding of ML training data needs (batching, sharding, augmentation integration).
Prior work supporting computer-vision teams (image pipelines, preprocessing, TFRecord or custom dataset formats).
Data Engineer at Kyndryl designing and maintaining data pipelines using AWS and Python. Optimizing ingestion, transformation workflows, and cloud solutions for large-scale data environments.
Data Architect responsible for the integrity and reliability of Patient Services data in Life Sciences. Ensuring analytics-ready data through strategic vendor collaboration and data stewardship.
Project & Data Engineer providing operational support and data management for utility service projects in the Greater Los Angeles area. Involves invoice processing, data accuracy, and system coordination.
Senior Data Engineer developing scalable data architectures and integrating data ecosystems at Porto Bank. Ensuring data quality and effective pipeline development for various business teams.
Data Engineering Advisor designing data flow management systems to support advanced analytics at Desjardins Group. Collaborate with teams to enhance data value and transformation.
Founding Staff Data Engineer building and leading the data engineering team for an AI-driven art valuation platform. Establishing architecture and standards for data systems and pipelines.
Senior Data Engineer responsible for developing and maintaining ETL processes and integrating data solutions. Collaborating with teams on data quality and cloud migration initiatives.
Data Engineer optimizing data architectures and pipelines at Nexu. Focused on building reliable and efficient data flows while collaborating with cross-functional teams.
Senior Software Engineer designing and maintaining scalable data solutions for the restaurant tech industry at SpotOn. Collaborating with cross-functional teams to enhance reporting and analytics platforms.
Data Architect needed to define and evolve data architecture supporting scientific compute at EIT. Collaborate and lead in large-scale research environments for transformative scientific challenges.