Data Engineer designing and building production data pipelines for AI and ML workloads at Capgemini Engineering. Focus on end-to-end data lifecycle management and AWS infrastructure.
Responsibilities
Design, build, and maintain research and production data pipelines spanning edge devices, cloud services, and centralized platforms
Own the full data lifecycle including collection, ingestion, processing, obfuscation, versioning, access, retention, and retirement
Develop resilient ingestion pipelines that handle device variability and connectivity challenges
Support secure data transfer from field environments to cloud storage
Collaborate with operations teams to improve data coverage, observability, and reliability
Implement privacy-preserving transformations and obfuscation pipelines
Build automated data cleaning and validation processes
Establish data lineage, retention policies, and access controls to ensure compliance and traceability
Provide scalable data services for training, evaluation, and research experimentation
Support continuous data refresh and retraining workflows
Build and optimize pipelines using AWS services such as S3, EC2, SageMaker, Lambda, Glue, and Step Functions
Requirements
Bachelor’s or master’s degree in computer science, data engineering, software engineering, or a related field
2-3+ years of experience building production data pipelines and data platforms for AI or ML systems
Strong proficiency in Python, C++, and distributed data processing frameworks
Hands-on experience with AWS services including S3, EC2, SageMaker, and Glue
Experience designing data systems that support large-scale ML training and experimentation
Knowledge of data governance, access control, and lifecycle management
Experience working with ML, data science, operations, and cloud engineering teams
Benefits
Health insurance from the first day
Christmas holidays from 25 December to 31 December
Cooperation with Superhumans Center and Veteran Hub
Psychological counseling provided by the Veteran Hub