Senior Data Engineer leading the design and optimization of scalable data architectures on AWS. Collaborating on complex data pipelines and mentoring junior engineers.
Responsibilities
Lead the design and development of scalable, high-performance data architectures on AWS, leveraging services such as S3, EMR, Glue, Redshift, Lambda, and Kinesis.
Architect and manage Data Lakes for handling structured, semi-structured, and unstructured data.
Design and build complex data pipelines using Apache Spark (Scala & PySpark), Kafka Streams (Java), and cloud-native technologies for batch and real-time data processing.
Optimize these pipelines for high performance, scalability, and cost-effectiveness.
Develop and optimize real-time data streaming applications using Kafka Streams in Java.
Build reliable, low-latency streaming solutions to handle high-throughput data, ensuring smooth data flow from sources to sinks in real time.
Manage Snowflake for cloud data warehousing, ensuring seamless data integration, query optimization, and support for advanced analytics.
Implement Apache Iceberg in Data Lakes for managing large-scale datasets with ACID compliance, schema evolution, and versioning.
Design and maintain highly scalable Data Lakes on AWS using S3, Glue, and Apache Iceberg.
Work with business stakeholders to create actionable insights using Tableau.
Build data models and dashboards that drive key business decisions, ensuring that data is easily accessible and interpretable.
Requirements
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent work experience).
5+ years of experience in Data Engineering or a related field, with a proven track record of designing, implementing, and maintaining large-scale distributed data systems.
Proficiency in Apache Spark (Scala & PySpark) for distributed data processing and real-time analytics.
Hands-on experience with Kafka Streams using Java for real-time data streaming applications.
Strong experience in Data Lake architectures on AWS, using services like S3, Glue, EMR, and data management platforms like Apache Iceberg.
Proficiency in Snowflake for cloud-based data warehousing, data modeling, and query optimization.
Expertise in SQL and in querying relational and NoSQL databases, with experience in database design and optimization.
Benefits
Lead and mentor junior engineers, fostering a culture of collaboration, continuous learning, and technical excellence.
Ensure high-quality code delivery, adherence to best practices, and optimal use of resources.