Perception Data Engineer at ANYbotics building data pipelines for perception models in mobile robotics. Collaborating within a global team on cutting-edge robotic technology.
Responsibilities
Build and operate the data plumbing that our perception models need: ingestion, versioned storage, ETL, labeling integration, and reliable production pipelines for training and inference.
Design, build and maintain scalable data pipelines and ETL workflows that ingest raw images, sensor metadata, and labels (both real and synthetic).
Implement dataset versioning, schema management, and reproducible data snapshots to support experiments and audits.
Integrate annotation tools (CVAT / Label Studio), manage labeling workflows and quality-control tooling, and support label QA processes.
Build data validation and monitoring checks (file integrity, label sanity, distribution drift alerts) and automate remediation where possible.
Provide clean, ready-to-use datasets and data loaders for ML engineers; optimize data access patterns for training (sharding, caching, prefetching).
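As a rough illustration of the validation checks mentioned above (file integrity and label sanity), here is a minimal sketch using only the Python standard library. The function names, the label record layout (a dict with a "label" string and a "bbox" of pixel coordinates in x_min, y_min, x_max, y_max order), and the checksum-manifest idea are assumptions made for this example, not a description of ANYbotics' actual tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file_integrity(path: Path, expected_sha256: str) -> bool:
    """True if the file exists and matches the checksum recorded at ingestion."""
    return path.is_file() and sha256_of(path) == expected_sha256


def label_problems(box: dict, width: int, height: int, classes: set[str]) -> list[str]:
    """Return a list of human-readable problems for one bounding-box label.

    Assumed (hypothetical) record layout:
        {"label": "person", "bbox": [x_min, y_min, x_max, y_max]}
    """
    problems = []
    if box["label"] not in classes:
        problems.append(f"unknown class {box['label']!r}")
    x_min, y_min, x_max, y_max = box["bbox"]
    # A valid box lies inside the image and has positive width and height.
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        problems.append("bbox out of bounds or degenerate")
    return problems
```

In a pipeline, checks like these would typically run per file and per label at ingestion time, with failures routed to a quarantine location or a relabeling queue rather than silently dropped.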
Requirements
3+ years of engineering experience building production data pipelines or ETL systems.
Strong Python scripting and engineering skills (pandas, pyarrow, boto3 or equivalent).
Experience with dataset versioning or large-file management (DVC, Git LFS, or similar) and cloud object storage (S3).
Familiarity with annotation tooling and workflows for image data (CVAT / Label Studio).
Basic understanding of ML training data needs (batching, sharding, augmentation integration).
Prior work supporting computer-vision teams (image pipelines, preprocessing, TFRecord or custom dataset formats).
Data Warehouse Modelling Engineer designing and maintaining data models using Data Vault 2.0 for iGaming industry. Collaborating with stakeholders and optimizing data models in a hybrid work environment.
Senior Data Engineer driving impactful data solutions for the climate logistics startup HIVED's core data platform. Collaborating with cross-functional squads to enhance analytics and delivery.
Data Engineer developing and maintaining CRE forecasting infrastructure for Cushman & Wakefield. Collaborates with senior economists and technical teams to ensure high-quality data solutions.
Data Engineer at PwC, engaging with Azure cloud services to enhance data handling and integrity. Responsibilities include pipeline optimizations, documentation, and collaboration with stakeholders.
Data Engineer Manager at PwC focusing on building data infrastructure and solutions. Leading data engineering projects to transform raw data into actionable insights and drive business growth.
Junior Data Engineer at OneMarketData focusing on data quality and integrity in financial datasets. Collaborating with senior analysts and assisting in data management and analysis tasks.
Senior Data Engineering Analyst developing and implementing data solutions. Collaborating in a diverse environment focused on data processing and analysis for clients' digital transformation.
Principal Software Engineer in Threat Data Platform developing AI-driven tools for threat intelligence automation. Collaborating on robust data pipelines for PANW’s product ecosystem.
Senior Azure Data Engineer maintaining business intelligence solutions for Grupo Gloria, implementing and stabilizing projects in Azure and Databricks with Power BI reporting.