Design and implement complex, scalable data systems using technologies such as cloud computing, big data, streaming, machine learning and AI, defining strategies for data storage, processing and analysis. Incorporate Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) and vector databases for intelligent data processing.
Lead and inspire the data engineering team by sharing knowledge, defining standards and best practices, and fostering the team's technical growth.
Identify and resolve complex problems related to data systems, architecture, performance and scalability, using monitoring tools, log analysis and code debugging.
Create and implement innovative solutions to optimize data systems, leveraging emerging technologies such as machine learning, data lakes, real-time data streaming and big data analytics.
Communicate technical solutions clearly and concisely to stakeholders, managers, developers and other teams, influencing strategic decisions and helping align the data strategy with company objectives.
Requirements
Deep proficiency in Python and data processing tools such as Apache Spark, Kafka, Flink or equivalents.
Knowledge of LLMs, fine-tuning AI models and using RAG to improve search and information retrieval. Proficiency with Amazon Bedrock and/or Azure OpenAI services.
Experience with data system architecture, big data (Hadoop, Hive, HBase), streaming (Kafka, Kinesis) and databases (SQL, NoSQL).
Advanced knowledge of data modeling, data processing and data visualization, working with SQL and NoSQL databases and tools such as Tableau and Power BI.
Advanced experience building Data Lakes and Data Warehouses using tools like AWS Athena, Amazon Redshift, Amazon S3, AWS Glue, Airbyte, dbt and PostgreSQL.
Deep understanding of machine learning concepts and applying ML techniques within data systems.
Bachelor's degree in Computer Science, Statistics, Mathematics or a related field.
Technical English for reading, writing and professional communication.
Senior Data Engineer at Clorox developing cloud-based data solutions. Leading data engineering projects and collaborating with business stakeholders to optimize data flows.
Data Engineer building solutions on AWS for high-performance data processing. Leading initiatives in data architecture and analytics for operational support.
Senior Data Engineer overseeing Databricks platform integrity, optimizing data practices for efficient usage. Leading teams on compliance while mentoring a junior Data Engineer.
Associate Data Engineer contributing to the development and maintenance of software applications using Python. Collaborating with teams on clean coding and debugging practices in Pune, India.
Lead Data Engineer responsible for delivering scalable cloud-based data solutions and managing cross-functional teams. Collaborating with global stakeholders and ensuring high-quality project execution in a fast-paced environment.
Data Engineer focusing on development and optimization of data pipelines in an insurance context. Ensuring data integrity and supporting data-driven decision-making processes.
Full Stack Data Engineer on a Central Engineering Portfolio Team in Chennai delivering curated data products and collaborating with data engineers and product owners.
Data Engineer designing and implementing data pipelines and services for Ford Pro analytics. Working with diverse teams and technologies to drive data-driven solutions.
Data Engineer developing best-in-class data platforms for ClearBank with a focus on data insights and automation. Collaborating closely with stakeholders and supporting data science initiatives.
Data Engineer operating a cloud-based data platform for Business Intelligence and Data Science. Collaborating on data architectures and ETL processes for Sparkassen-Finanzgruppe.