Data Engineer developing data pipelines and stream processing solutions for Leonardo in the Cyber & Security Solutions area. Supporting data ingestion, processing, and analytics for large-scale datasets.
Responsibilities
Develop data pipelines for ingestion, processing, and transformation of large data volumes
Implement batch processing jobs with Apache Spark (PySpark, Scala)
Build real-time data pipelines with Apache Kafka and Apache Flink
Implement stream processing applications for event transformation, enrichment, and aggregation
Orchestrate complex workflows with Apache Airflow (DAG design, dependencies, scheduling)
Develop analytical transformations with advanced SQL and dbt for analytics layers
Build streaming aggregations with windowing operations (tumbling, sliding, session windows)
Integrate stream processing with batch layers for unified analytics
Implement exactly-once processing semantics and state management in Flink
Develop Kafka consumers and producers with configurations optimized for throughput
Implement data quality testing and validation frameworks
Integrate with data lakehouses (Delta Lake, Iceberg) and object storage for data persistence
Implement stream-to-lake integration to persist streaming data in the lakehouse
Develop data models (dimensional, star schema) for analytics and reporting
Collaborate with analytics teams on requirements gathering and data modeling
Optimize the performance of Spark jobs, query execution plans, and streaming applications for low-latency processing
Implement incremental processing patterns for efficiency
Implement monitoring and alerting for streaming pipeline health
Manage backpressure and failure recovery in streaming applications
Support integration with BI tools (Tableau, Power BI) for reporting
Contribute to DataOps practices (CI/CD for data pipelines, testing, monitoring) and stream processing best practices
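Several of the responsibilities above center on windowed streaming aggregation. As a framework-free illustration of the tumbling-window idea (what Flink's windowing operators implement at scale), here is a minimal Python sketch; the function name, event shape, and window size are illustrative, not part of any Flink API:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size_s):
    """Assign each (timestamp_s, key) event to a fixed, non-overlapping
    (tumbling) window and count events per key per window."""
    counts = defaultdict(int)  # (window_start, key) -> count
    for ts, key in events:
        window_start = (ts // window_size_s) * window_size_s
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(0, "a"), (3, "a"), (5, "b"), (12, "a"), (14, "b")]
print(tumbling_window_counts(events, 10))
# Two 10-second windows: [0, 10) holds a:2, b:1; [10, 20) holds a:1, b:1.
```

A sliding window differs only in that each event lands in every window whose interval covers it, and a session window closes after a gap of inactivity rather than at a fixed boundary.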
Requirements
Master's degree in Computer Engineering, Mathematics, Statistics, Physics, Computer Science, or equivalent
2 to 5 years of experience in the role, or more than 5 years of experience in similar roles
Data processing with Apache Spark (PySpark and Scala APIs) for batch workloads
Stream processing with Apache Flink (DataStream API, Table API, SQL)
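The stream-processing skills required here revolve around keyed state: each key maintains its own aggregate, and snapshots of that state enable failure recovery (the idea behind Flink's checkpoints). A framework-free Python sketch of the concept (class and method names are illustrative, not Flink API):

```python
class KeyedRunningSum:
    """Minimal keyed-state operator: one running sum per key, with
    snapshot/restore standing in for checkpoint-based recovery."""

    def __init__(self, state=None):
        # Restoring from a snapshot resumes exactly where processing stopped.
        self.state = dict(state or {})

    def process(self, key, value):
        self.state[key] = self.state.get(key, 0) + value
        return self.state[key]

    def snapshot(self):
        return dict(self.state)

op = KeyedRunningSum()
for k, v in [("a", 1), ("b", 2), ("a", 3)]:
    op.process(k, v)
snap = op.snapshot()                  # persisted checkpoint
restored = KeyedRunningSum(snap)      # recovery after a failure
print(restored.process("a", 1))       # prints 5
```

Exactly-once semantics in a real pipeline additionally require coordinating these snapshots with source offsets and sink commits, which Flink handles via checkpoint barriers and transactional sinks.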