AWS Glue Data Engineer at DeepLight AI responsible for data ingestion and pipeline performance optimisation. Collaborate with teams to build scalable solutions in a hybrid work environment.
Responsibilities
***Your responsibilities as the AWS Glue Data Engineer will include:***
**Data Ingestion Development**
Building and implementing AWS Glue jobs for Bronze layer ingestion using defined standards and templates.
Implementing correct loading methods based on source requirements (CDC, full load, delta, snapshot).
Designing and executing historical loading mechanisms to bring legacy data into the Lakehouse.
**Performance Optimisation**
Optimising Glue job performance (DPU allocation, parallelisation, partitioning) according to best practices.
Collaborating with platform teams to ensure tooling and optimisation alignment.
**Migration & Automation**
Aggressively migrating source tables to Bronze layer, initially using manual approaches with standards/templates, later leveraging AI-enabled acceleration.
Ensuring jobs are version-controlled and production deployment is automated via Git and Terraform.
**Governance & Monitoring**
Implementing source system connectivity into CDP in collaboration with source system owners.
Ensuring jobs comply with data contracts and are properly monitored.
Preparing documentation and handover to operational support teams.
**Collaboration**
Working closely with Data Architect for ingestion patterns and standards.
Coordinating with Data Assurance Lead to apply quality checks across all jobs.
Partnering with platform engineers for tooling and optimisation.
Requirements
***You will have experience in:***
AWS Glue, PySpark, and ETL pipeline development;
substantial knowledge of Lakehouse architecture and Medallion design principles;
familiarity with CDC, delta loads, and historical data ingestion strategies; and
5+ years' experience in data engineering roles, with hands-on experience in AWS Glue.
***You should also have knowledge of:***
AWS services: Glue, S3, Athena, Lambda;
Git and Terraform for CI/CD automation;
data quality frameworks (e.g., Soda Core);
identifying ways to automate your own repetitive tasks;
working in a fast-paced environment and delivering aggressive migration targets;
collaborating and communicating with stakeholders at different levels; and
working with Jira and agile ways of working.
Benefits
**Benefits & Growth Opportunities:**
· Competitive salary and performance bonuses
· Comprehensive health insurance
· Professional development and certification support
· Opportunity to work on cutting-edge AI projects
· Flexible working arrangements
· Career advancement opportunities in a rapidly growing AI company