Data Engineer optimizing data ingestion and transformation pipelines for seamless data flow. Collaborating with cross-functional teams using Databricks and other cloud services in a hybrid work setting.
Responsibilities
Ingest data from a variety of sources such as Azure SQL DB, Google Analytics, Google Play Store, Apple App Store, Salesforce, and others.
Develop and optimize ETL/ELT pipelines to transform data from CSV, JSON, SQL tables, and APIs into usable formats.
Work with REST APIs to pull data from various external sources and integrate it into our data ecosystem.
Design and implement efficient data transformation processes to cleanse, aggregate, and enrich data.
Apply industry best practices for data modeling to ensure scalability, performance, and data integrity.
Collaborate with data analysts and data scientists to provide clean, high-quality datasets for reporting and analysis.
Utilize Databricks for data processing, transformation, and orchestration tasks.
Manage and optimize Databricks clusters for performance, reliability, and cost-effectiveness.
Implement Databricks workflows to automate and streamline data pipelines.
Use Unity Catalog for data governance and metadata management, ensuring compliance and data access control.
Requirements
5+ years of hands-on experience in data engineering or a related field.
Proven experience with Databricks and Databricks workflows, including cluster management and data pipeline orchestration.
Strong experience in data ingestion from SQL databases (Azure SQL DB), APIs (Google Analytics, Google Play Store, Apple App Store, Salesforce), and file-based sources (CSV, JSON).
Proficiency in SQL for data manipulation and transformation.
Experience with Python or Scala for writing and managing data workflows.
Working knowledge of REST APIs for data integration.
Experience in data transformation using Apache Spark, Delta Lake, or similar technologies.
Knowledge of cloud platforms such as Azure, with a focus on Azure SQL DB.
Familiarity with Unity Catalog for metadata management and governance.
Understanding of data architecture, data pipelines, and the ETL/ELT process.
Experience in data modeling, optimizing queries, and working with large datasets.
Familiarity with data governance, metadata management, and data access controls.
Knowledge of Apache Kafka or other real-time streaming technologies (optional).
Experience with Data Lake or Data Warehouse technologies (optional).
Familiarity with additional orchestration and transformation tools such as Apache Airflow or dbt (optional).
Understanding of machine learning workflows and data pipelines (optional).