Principal Data Engineer modernizing cloud-native platforms for AI-powered solutions at Mastercard. Leading teams to enhance data processing efficiency and reliability across global operations.
Responsibilities
Drive modernization from legacy and on-prem systems to modern, cloud-native, and hybrid data platforms.
Architect and lead the development of a Multi-Agent ETL Platform for batch and event streaming, integrating AI agents to autonomously manage ETL tasks such as data discovery, schema mapping, and error resolution.
Define and implement data ingestion, transformation, and delivery pipelines using scalable frameworks (e.g., Apache Airflow, NiFi, dbt, Spark, Kafka, or Dagster).
Leverage LLMs and agent frameworks (e.g., LangChain, CrewAI, AutoGen) to automate pipeline management and monitoring.
Ensure robust data governance, cataloging, versioning, and lineage tracking across the ETL platform.
Define project roadmaps, KPIs, and performance metrics for platform efficiency and data reliability.
Establish and enforce best practices in data quality, CI/CD for data pipelines, and observability.
Collaborate closely with cross-functional teams (Data Science, Analytics, and Application Development) to understand requirements and deliver efficient data ingestion and processing workflows.
Establish and enforce best practices, automation standards, and monitoring frameworks to ensure the platform’s reliability, scalability, and security.
Build relationships and communicate effectively with internal and external stakeholders, including senior executives, to influence data-driven strategies and decisions.
Continuously engage and improve teams’ performance by conducting recurring meetings, knowing your people, managing career development, and understanding who is at risk.
Oversee deployment, monitoring, and scaling of ETL and agent workloads across multi-cloud environments.
Continuously improve platform performance, cost efficiency, and automation maturity.
Requirements
Hands-on experience in data engineering, data platform strategy, or a related technical domain.
Proven experience leading global data engineering or platform engineering teams.
Proven experience in building and modernizing distributed data platforms using technologies such as Apache Spark, Kafka, Flink, NiFi, and Cloudera/Hadoop.
Strong experience with one or more data pipeline tools (NiFi, Airflow, dbt, Spark, Kafka, Dagster, etc.) and distributed data processing at scale.
Experience building and managing AI-augmented or agent-driven systems is a plus.
Proficiency in Python, SQL, and data ecosystems (Oracle, AWS Glue, Azure Data Factory, BigQuery, Snowflake, etc.).
Deep understanding of data modeling, metadata management, and data governance principles.
Proven success in leading technical teams and managing complex, cross-functional projects.
Passion for staying current in a fast-paced field with proven ability to lead innovation in a scaled organization.
Excellent communication skills, with the ability to tailor technical concepts to executive, operational, and technical audiences.
Expertise and ability to lead technical decision-making considering scalability, cost efficiency, stakeholder priorities, and time to market.
Proven track record of leading high-performing teams, with experience coaching director-level reports and experienced individual contributors.
Advanced degree in Data Science, Computer Science, Information Technology, Business Administration, or a related field. Equivalent experience will also be considered.
Benefits
insurance (including medical, prescription drug, dental, vision, disability, life insurance)
flexible spending account and health savings account
paid leaves (including 16 weeks of new parent leave and up to 20 days of bereavement leave)
80 hours of Paid Sick and Safe Time, 25 days of vacation time and 5 personal days, pro-rated based on date of hire
10 annual paid U.S. observed holidays
401(k) with a best-in-class company match
deferred compensation for eligible roles
fitness reimbursement or on-site fitness facilities
Data Engineer building modern Data Lake architecture on AWS and implementing scalable ETL/ELT pipelines. Collaborating across teams for analytics and reporting on gaming platforms.
Chief Data Engineer leading Scania’s Commercial Data Engineering team for growing sustainable transport solutions. Focused on data products and pipelines for BI, analytics, and AI.
Data Engineer designing and building scalable ETL/ELT pipelines for enterprise-grade analytics solutions. Collaborating with product teams to deliver high-quality, secure, and discoverable data.
Entry-Level Data Engineer at GM, focusing on building large-scale data platforms in cloud environments. Collaborating with data engineers and scientists while migrating systems to cloud solutions.
Data Engineer responsible for data integrations with AWS technology stack for Adobe's Digital Experience. Collaborating with multiple teams to conceptualize solutions and improve data ecosystem.
People Data Architect designing and managing people data analytics for Gen, delivering actionable insights for HR. Collaborating across teams to enhance data-driven decision-making.
Data Engineer role focused on shaping future connectivity for customers at Vodafone. Involves solving complex challenges in a diverse and inclusive environment.
VP, Senior Data Engineer responsible for designing and developing cloud data solutions for insider risk in Information Security at SMBC. Collaborating with multiple teams to enhance cybersecurity data platform.
Data Engineer responsible for architecting, developing, and maintaining Allegiant’s enterprise data infrastructure. Overseeing transition to cloud-hosted data warehouse and developing next-generation data tools.
Senior Data Engineer developing Azure-based data solutions for clients in the Data & AI department. Collaborating with architects and consultants to enhance automated decision making.