Principal Data Engineer at Trainline shaping robust data foundations for AI- and ML-driven products. Collaborating with cross-functional teams to ensure best practices in data engineering and ML.
Responsibilities
Act as a technical authority across multiple teams, setting standards and patterns for data and ML‑adjacent infrastructure
Embed with ML teams to design, build and evolve data platforms supporting AI and ML workloads
Influence technical direction without direct line management responsibility
Partner with Data Engineering teams outside of ML to build a community and share best practices and findings across all areas
Identify systemic issues and proactively drive improvements across the data ecosystem
Look for short-term and strategic opportunities to enhance core platforms with new self-serve enablement features for ML and Data Engineering
Partner with Machine Learning Engineers to design data pipelines supporting model training, inference and experimentation
Design and review architectures for ML‑ready data platforms
Build and optimise data pipelines using SQL, Spark or Ray, and Python
Define best practices for orchestration using Airflow or similar tools
Support API‑driven and event‑based data access patterns
Work with AWS infrastructure such as ECS, vector databases and Bedrock APIs
Review designs and code across teams to raise quality and consistency
Coach engineers through pairing, design reviews and informal mentoring
Collaborate on innovative AI‑powered product features such as the Travel Assistant
Requirements
Extensive experience as a Senior, Staff or Principal Data Engineer operating across teams
Deep expertise in SQL and Python, with strong experience in Spark or similar tooling
Strong understanding of orchestration tools such as Apache Airflow
Experience designing data platforms for ML and AI workloads
A track record of introducing new technologies and practices, and of navigating ambiguity and multiple stakeholders
Hands‑on experience with AWS infrastructure (e.g. ECS, IAM, data storage, compute)
Familiarity with vector databases and modern AI/ML APIs (e.g. Bedrock)
Experience working closely with Machine Learning Engineers in production environments
Strong system design skills and the ability to influence through technical leadership
Senior Data Engineer supporting an AI-enabled financial compliance initiative with data pipelines and ingestion processes. Collaborating with diverse teams in a mission-critical regulated environment.
Data Architect leading the definition and construction of cloud data architecture for Kyndryl. Participating in significant technological modernization initiatives, focusing on Google Cloud Platform.
Senior Data Engineer driving data intelligence requirements and scalable data solutions for a global consulting firm. Collaborating across functions to enhance Microsoft architecture and analytics capabilities.
Experienced AI Engineer designing and building production-grade agentic AI systems using generative AI and large language models. Collaborating with data engineers and data scientists in a tech-driven company.
Intermediate Data Engineer designing and building data pipelines for travel industry data management. Collaborating across teams to ensure reliable data for analytics and reporting.
Data Engineer managing and organizing datasets for AI models at Walaris, developing AI-driven autonomous systems for defense and security applications.
Data Engineer designing and maintaining data pipelines at Black Semiconductor. Collaborating with process, equipment, and IT teams to support manufacturing analytics and decision-making.
Junior Data Engineer role focusing on Business Intelligence and Big Data at Avanade. Collaborating on data analysis and SQL queries in a supportive learning environment.
GCP Data Engineer designing and developing data processing modules for Ki, an algorithmic insurance carrier. Working closely with multiple teams to optimize data pipelines and reporting.
Data Engineer at Securian Financial optimizing scalable data pipelines for AI and advanced analytics. Collaborating with teams to deliver secure and accessible data solutions.