Data Engineer at Vistra, designing and maintaining data pipelines for analytics. The role involves collaborating with teams and optimizing data integration using modern cloud technologies.
Responsibilities
Design and implement scalable ETL/ELT pipelines using AWS services including AWS Glue, Lambda, S3, and Step Functions
Build and optimize data integration processes connecting MySQL databases, APIs, and external data sources to analytical systems and data warehouses
Develop automated data quality monitoring, validation, and cleansing processes
Create and maintain data models, schemas, and documentation to support analytics teams
Implement real-time and batch data processing solutions using serverless architectures
Collaborate with development teams to integrate data collection points into Next.js applications and Node.js services
Build and maintain data analytics APIs and services
Monitor data pipeline performance, troubleshoot issues, and implement proactive alerting and logging mechanisms
Design and implement data backup, archival, and disaster recovery strategies
Work with data analysts and business stakeholders to understand reporting requirements
Requirements
Bachelor’s degree in Computer Science, Data Engineering, Mathematics, or a related technical field
4-6 years of hands-on data engineering experience with strong proficiency in Python for data processing, transformation, and pipeline development
Extensive experience with AWS data services including AWS Glue, Lambda, S3, Athena, Redshift, and Kinesis for building serverless data pipelines
Strong SQL skills and experience with MySQL database design, optimization, and administration including performance tuning and query optimization
Experience with data pipeline orchestration tools such as Apache Airflow, AWS Step Functions, or similar workflow management systems
Proficiency in data formats including JSON, CSV, Parquet, and Avro
Knowledge of data warehousing concepts, dimensional modeling, and analytics best practices for supporting business intelligence requirements
Experience with version control systems, CI/CD pipelines, and infrastructure as code practices for deploying and managing data infrastructure
AWS certifications such as AWS Certified Data Analytics Specialty or AWS Certified Solutions Architect
Experience with streaming data technologies including Apache Kafka, Amazon Kinesis, or real-time data processing frameworks
Knowledge of machine learning workflows and experience building data pipelines that support ML model training and inference
Familiarity with business intelligence tools such as Tableau, Power BI, or Amazon QuickSight for creating data visualizations and dashboards
Experience with containerization technologies like Docker and orchestration platforms for deploying data processing applications
Understanding of data governance, privacy regulations, and security best practices for handling sensitive data in cloud environments
Experience with NoSQL databases such as DynamoDB, MongoDB, or Elasticsearch for handling unstructured data and high-volume analytics workloads
Benefits
Flexible hybrid working arrangement
Birthday leave
Comprehensive medical insurance and dental coverage
Wellness allowance
Competitive annual leave entitlement
Internal mentorship program
Reimbursement of professional membership fees for certifications