Implementing ingestion pipelines, using Airflow as the orchestration platform, for consuming data from a wide variety of sources (API, SFTP, Cloud Storage Bucket, etc.).
Implementing transformation pipelines using software engineering best practices and tools (dbt)
Working closely with Software Engineering and DevOps to maintain reproducible infrastructure and data that serves both API-only customers and in-house SaaS products
Defining and implementing data ingestion/transformation quality control processes using established frameworks (pytest, dbt)
Building pipelines that use multiple technologies and cloud environments (for example, an Airflow pipeline pulling a file from an S3 bucket and loading the data into BigQuery)
Building data automation and ensuring its stability with associated monitoring tools.
Reviewing existing and proposed infrastructure for architectural enhancements that follow both software engineering and data analytics best practices.
Working closely with Data Science and facilitating advanced data analysis (such as machine learning)
Requirements
Strong working knowledge of Apache Airflow
Experience supporting a SaaS or DaaS product, bonus points if you were creating new data products/features
Strong in Linux environments and experienced in scripting languages (Python)
Strong understanding of software best practices and associated tools.
Experience in any major RDBMS (MySQL, Postgres, SQL Server, etc.).
Strong SQL skills, bonus points for having used both T-SQL and Standard SQL
Experience with NoSQL (Elasticsearch, MongoDB, etc.)
Multi-cloud and/or hybrid-cloud experience
Strong interpersonal skills
Comfortable working directly with data providers, including non-technical individuals
Experience with the following (or transitioning from equivalent platform services): Cloud Storage, Cloud Pub/Sub, BigQuery, Apache Airflow, dbt, Dataflow