Principal Data Engineer modernizing cloud-native platforms for AI-powered solutions at Mastercard. Leading teams to enhance data processing efficiency and reliability across global operations.
Responsibilities
Drive modernization from legacy and on-prem systems to modern, cloud-native, and hybrid data platforms.
Architect and lead the development of a Multi-Agent ETL Platform for batch and event streaming, integrating AI agents to autonomously manage ETL tasks such as data discovery, schema mapping, and error resolution.
Define and implement data ingestion, transformation, and delivery pipelines using scalable frameworks (e.g., Apache Airflow, NiFi, dbt, Spark, Kafka, or Dagster).
Leverage LLMs and agent frameworks (e.g., LangChain, CrewAI, AutoGen) to automate pipeline management and monitoring.
Ensure robust data governance, cataloging, versioning, and lineage tracking across the ETL platform.
Define project roadmaps, KPIs, and performance metrics for platform efficiency and data reliability.
Establish and enforce best practices in data quality, CI/CD for data pipelines, and observability.
Collaborate closely with cross-functional teams (Data Science, Analytics, and Application Development) to understand requirements and deliver efficient data ingestion and processing workflows.
Establish and enforce best practices, automation standards, and monitoring frameworks to ensure the platform’s reliability, scalability, and security.
Build relationships and communicate effectively with internal and external stakeholders, including senior executives, to influence data-driven strategies and decisions.
Continuously engage and improve teams’ performance by conducting recurring meetings, knowing your people, managing career development, and understanding who is at risk.
Oversee deployment, monitoring, and scaling of ETL and agent workloads across multi-cloud environments.
Continuously improve platform performance, cost efficiency, and automation maturity.
Requirements
Hands-on experience in data engineering, data platform strategy, or a related technical domain.
Proven experience leading global data engineering or platform engineering teams.
Proven experience in building and modernizing distributed data platforms using technologies such as Apache Spark, Kafka, Flink, NiFi, and Cloudera/Hadoop.
Strong experience with one or more data pipeline tools (NiFi, Airflow, dbt, Spark, Kafka, Dagster, etc.) and distributed data processing at scale.
Experience building and managing AI-augmented or agent-driven systems is a plus.
Proficiency in Python, SQL, and data ecosystems (Oracle, AWS Glue, Azure Data Factory, BigQuery, Snowflake, etc.).
Deep understanding of data modeling, metadata management, and data governance principles.
Proven success in leading technical teams and managing complex, cross-functional projects.
Passion for staying current in a fast-paced field with proven ability to lead innovation in a scaled organization.
Excellent communication skills, with the ability to tailor technical concepts to executive, operational, and technical audiences.
Expertise and ability to lead technical decision-making considering scalability, cost efficiency, stakeholder priorities, and time to market.
Proven track record of leading high-performing teams, including coaching director-level reports and experienced individual contributors.
Advanced degree in Data Science, Computer Science, Information Technology, Business Administration, or a related field. Equivalent experience will also be considered.
Benefits
insurance (including medical, prescription drug, dental, vision, disability, life insurance)
flexible spending account and health savings account
paid leaves (including 16 weeks of new parent leave and up to 20 days of bereavement leave)
80 hours of Paid Sick and Safe Time, 25 days of vacation time and 5 personal days, pro-rated based on date of hire
10 annual paid U.S. observed holidays
401k with a best-in-class company match
deferred compensation for eligible roles
fitness reimbursement or on-site fitness facilities
Experienced AI Engineer designing and building production-grade agentic AI systems using generative AI and large language models. Collaborating with data engineers and data scientists in a tech-driven company.
Intermediate Data Engineer designing and building data pipelines for travel industry data management. Collaborating across teams to ensure reliable data for analytics and reporting.
Data Engineer managing and organizing datasets for AI models at Walaris, developing AI-driven autonomous systems for defense and security applications.
Data Engineer designing and maintaining data pipelines at Black Semiconductor. Collaborating with process, equipment, and IT teams to support manufacturing analytics and decision-making.
Junior Data Engineer role focusing on Business Intelligence and Big Data at Avanade. Collaborating on data analysis and SQL queries in a supportive learning environment.
GCP Data Engineer designing and developing data processing modules for Ki, an algorithmic insurance carrier. Working closely with multiple teams to optimize data pipelines and reporting.
Data Engineer at Securian Financial optimizing scalable data pipelines for AI and advanced analytics. Collaborating with teams to deliver secure and accessible data solutions.
IT Data Engineering Co‑Op at BlueRock Therapeutics supports development of scientific data systems. Collaboration on data workflows and foundational AWS data engineering tasks.
Data Engineer I building and operationalizing complex data solutions for Travelers' analytics using Databricks. Collaborating within teams to educate end users and support data governance.
Data Engineer shaping modern data architecture to drive golf’s digital transformation. Collaborating with teams to enhance data pipelines and insights for customer engagement and revenue growth.