Data Engineer optimizing data ingestion and transformation pipelines for seamless data flow. Collaborating with cross-functional teams using Databricks and other cloud services in a hybrid work setting.
Responsibilities
Ingest data from a variety of sources such as Azure SQL DB, Google Analytics, Google Play Store, Apple App Store, Salesforce, and others.
Develop and optimize ETL/ELT pipelines to transform data from CSV, JSON, SQL tables, and APIs into usable formats.
Work with REST APIs to pull data from various external sources and integrate it into our data ecosystem.
Design and implement efficient data transformation processes to cleanse, aggregate, and enrich data.
Apply industry best practices for data modeling to ensure scalability, performance, and data integrity.
Collaborate with data analysts and data scientists to provide clean, high-quality datasets for reporting and analysis.
Utilize Databricks for data processing, transformation, and orchestration tasks.
Manage and optimize Databricks clusters for performance, reliability, and cost-effectiveness.
Implement Databricks workflows to automate and streamline data pipelines.
Use Unity Catalog for data governance and metadata management, ensuring compliance and data access control.
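As a hedged illustration of the cleanse-and-aggregate transformation work described above, a minimal Python sketch (the `country`/`revenue` schema and the `transform` helper are hypothetical, not from this posting):

```python
import csv
import io
import json
from collections import defaultdict

def transform(raw_rows):
    """Cleanse and aggregate raw records (hypothetical app-store-style schema)."""
    totals = defaultdict(float)
    for row in raw_rows:
        # Normalize the grouping key: trim whitespace, lowercase, default missing values.
        country = (row.get("country") or "unknown").strip().lower()
        try:
            revenue = float(row.get("revenue", 0))
        except ValueError:
            continue  # drop malformed rows rather than fail the whole pipeline
        totals[country] += revenue
    return dict(totals)

# Example CSV input such as an ingestion job might pull from a file-based source.
raw_csv = """country,revenue
US,10.5
us ,2.0
DE,3.25
DE,not_a_number
"""
rows = list(csv.DictReader(io.StringIO(raw_csv)))
print(json.dumps(transform(rows), sort_keys=True))  # → {"de": 3.25, "us": 12.5}
```

In a Databricks pipeline the same cleanse/aggregate step would typically be expressed with Spark DataFrames over Delta tables; the sketch above only shows the shape of the transformation logic.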
Requirements
5+ years of hands-on experience in data engineering or a related field.
Proven experience with Databricks and Databricks workflows, including cluster management and data pipeline orchestration.
Strong experience in data ingestion from SQL databases (Azure SQL DB), APIs (Google Analytics, Google Play Store, Apple App Store, Salesforce), and file-based sources (CSV, JSON).
Proficiency in SQL for data manipulation and transformation.
Experience with Python or Scala for writing and managing data workflows.
Working knowledge of REST APIs for data integration.
Experience in data transformation using Apache Spark, Delta Lake, or similar technologies.
Knowledge of cloud platforms such as Azure, with a focus on Azure SQL DB.
Familiarity with Unity Catalog for metadata management and governance.
Understanding of data architecture, data pipelines, and the ETL/ELT process.
Experience in data modeling, optimizing queries, and working with large datasets.
Familiarity with data governance, metadata management, and data access controls.
Knowledge of Apache Kafka or other real-time streaming technologies (optional).
Experience with Data Lake or Data Warehouse technologies (optional).
Familiarity with additional data transformation tools such as Apache Airflow or dbt (optional).
Understanding of machine learning workflows and data pipelines (optional).
Senior Data Engineer supporting AI-enabled financial compliance initiative with data pipelines and ingestion processes. Collaborating with diverse teams in a mission-critical regulated environment.
Data Architect leading the definition and construction of cloud data architecture for Kyndryl. Participating in significant technological modernization initiatives, focusing on Google Cloud Platform.
Senior Data Engineer driving data intelligence requirements and scalable data solutions for a global consulting firm. Collaborating across functions to enhance Microsoft architecture and analytics capabilities.
Experienced AI Engineer designing and building production-grade agentic AI systems using generative AI and large language models. Collaborating with data engineers and data scientists in a tech-driven company.
Intermediate Data Engineer designing and building data pipelines for travel industry data management. Collaborating across teams to ensure reliable data for analytics and reporting.
Data Engineer managing and organizing datasets for AI models at Walaris, developing AI-driven autonomous systems for defense and security applications.
Data Engineer designing and maintaining data pipelines at Black Semiconductor. Collaborating with process, equipment, and IT teams to support manufacturing analytics and decision-making.
Junior Data Engineer role focusing on Business Intelligence and Big Data at Avanade. Collaborating on data analysis and SQL queries in a supportive learning environment.
GCP Data Engineer designing and developing data processing modules for Ki, an algorithmic insurance carrier. Working closely with multiple teams to optimize data pipelines and reporting.
Data Engineer at Securian Financial optimizing scalable data pipelines for AI and advanced analytics. Collaborating with teams to deliver secure and accessible data solutions.