Principal Data Engineer modernizing cloud-native platforms for AI-powered solutions at Mastercard. Leading teams to enhance data processing efficiency and reliability across global operations.
Responsibilities
Drive modernization from legacy and on-prem systems to modern, cloud-native, and hybrid data platforms.
Architect and lead the development of a Multi-Agent ETL Platform for batch and event streaming, integrating AI agents to autonomously manage ETL tasks such as data discovery, schema mapping, and error resolution.
Define and implement data ingestion, transformation, and delivery pipelines using scalable frameworks (e.g., Apache Airflow, NiFi, dbt, Spark, Kafka, or Dagster).
Leverage LLMs and agent frameworks (e.g., LangChain, CrewAI, AutoGen) to automate pipeline management and monitoring.
Ensure robust data governance, cataloging, versioning, and lineage tracking across the ETL platform.
Define project roadmaps, KPIs, and performance metrics for platform efficiency and data reliability.
Establish and enforce best practices in data quality, CI/CD for data pipelines, and observability.
Collaborate closely with cross-functional teams (Data Science, Analytics, and Application Development) to understand requirements and deliver efficient data ingestion and processing workflows.
Establish and enforce best practices, automation standards, and monitoring frameworks to ensure the platform’s reliability, scalability, and security.
Build relationships and communicate effectively with internal and external stakeholders, including senior executives, to influence data-driven strategies and decisions.
Continuously engage teams and improve their performance by holding recurring meetings, knowing your people, managing career development, and understanding who is at risk.
Oversee deployment, monitoring, and scaling of ETL and agent workloads across multi-cloud environments.
Continuously improve platform performance, cost efficiency, and automation maturity.
Requirements
Hands-on experience in data engineering, data platform strategy, or a related technical domain.
Proven experience leading global data engineering or platform engineering teams.
Proven experience in building and modernizing distributed data platforms using technologies such as Apache Spark, Kafka, Flink, NiFi, and Cloudera/Hadoop.
Strong experience with one or more data pipeline tools (NiFi, Airflow, dbt, Spark, Kafka, Dagster, etc.) and distributed data processing at scale.
Experience building and managing AI-augmented or agent-driven systems is a plus.
Proficiency in Python, SQL, and data ecosystems (Oracle, AWS Glue, Azure Data Factory, BigQuery, Snowflake, etc.).
Deep understanding of data modeling, metadata management, and data governance principles.
Proven success in leading technical teams and managing complex, cross-functional projects.
Passion for staying current in a fast-paced field with proven ability to lead innovation in a scaled organization.
Excellent communication skills, with the ability to tailor technical concepts to executive, operational, and technical audiences.
Expertise and ability to lead technical decision-making considering scalability, cost efficiency, stakeholder priorities, and time to market.
Proven track record of leading high-performing teams, with experience leading and coaching director-level reports and experienced individual contributors.
Advanced degree in Data Science, Computer Science, Information Technology, Business Administration, or a related field. Equivalent experience will also be considered.
Benefits
insurance (including medical, prescription drug, dental, vision, disability, life insurance)
flexible spending account and health savings account
paid leaves (including 16 weeks of new parent leave and up to 20 days of bereavement leave)
80 hours of Paid Sick and Safe Time, 25 days of vacation time and 5 personal days, pro-rated based on date of hire
10 annual paid U.S. observed holidays
401k with a best-in-class company match
deferred compensation for eligible roles
fitness reimbursement or on-site fitness facilities