Data Architect designing and modernizing large-scale, cloud-native data platforms. Focused on distributed processing, real-time pipelines, automation, and GenAI enablement at a data management company.
Responsibilities
Architect and govern enterprise Big Data platforms (data lake, lakehouse, warehouse, real-time).
Design high-volume, high-velocity data pipelines using batch and streaming frameworks.
Lead implementation of distributed processing architectures (Spark, PySpark, EMR).
Build event-driven and real-time streaming solutions (Kafka, Kinesis, Flink).
Define ETL/ELT patterns, metadata-driven pipelines, and reusable ingestion frameworks.
Drive data platform automation (Airflow/Step Functions, CI/CD, data quality, observability).
Optimize performance, scalability, fault tolerance, and cost across Big Data workloads.
Integrate GenAI architectures (LLMs, embeddings, vector databases, RAG) with enterprise data lakes.
Ensure security, governance, lineage, and compliance across data platforms.
Provide hands-on leadership and technical mentoring to data engineering teams.
Requirements
12+ years in Big Data Engineering / Data Architecture roles.
Expert-level experience with Spark, PySpark, SQL, and distributed compute engines.
Strong knowledge of AWS Big Data stack: S3, EMR, Glue, Athena, Redshift, Lambda, Step Functions.
Hands-on experience with Snowflake (performance tuning, data sharing, optimization).
Expertise in streaming platforms: Kafka, Kinesis, Flink, or Spark Streaming.
Strong experience with data modeling (dimensional, Data Vault 2.0).
Proficiency in Python, schema evolution, partitioning, and data versioning.
Experience with orchestration and automation tools (Airflow, Dagster, CI/CD).
Working knowledge of GenAI data integration (feature stores, vector DBs, RAG pipelines).
Experience with Agile delivery and leading globally distributed engineering teams.
Health Data Engineer organizing big data and implementing data engineering activities for mission-driven healthcare projects at Booz Allen. Collaborating with a multidisciplinary team in a fast-paced environment.
Data Engineer responsible for developing and maintaining data engineering solutions at Ørsted. Collaborating with stakeholders and modernising database architecture for future needs.
Senior Azure Data Engineer at Accellor enhancing data capabilities for enterprise clients using Azure technologies. Building solutions to transform business processes through robust data engineering practices.
Data Architect at Evertec designing and implementing data architectures in Sao Paulo. Collaborating with cross-functional teams and ensuring data governance and quality practices.
Senior Data Engineer creating scalable data pipelines and deploying ML models. Collaborating within a passionate team at Norsys focusing on IT engineering and consulting.
Senior Data Engineer designing scalable ETL data pipelines using Databricks for a software consulting company. Collaborating with teams to implement robust data solutions in diverse business environments.
Data Engineer designing and optimizing data solutions for Cerium clients. Leveraging advanced architectures for data estates and providing strategic insights through analytics.
Data Engineer responsible for architecting and building data solutions for clients at Cerium Networks. Collaborating with teams to leverage advanced data architectures and ensure client satisfaction.
Data Engineer at Cerium Networks designing and optimizing data solutions for clients. Employing advanced architectures and providing consulting services while supporting client integrations.