Data Engineer leading data foundation architecture and optimization for a Kenyan startup. Constructing data pipelines that fuel machine learning models and internal analytics.
Responsibilities
Architect and sustain scalable ETL workflows, guaranteeing consistency and accuracy across diverse data origins.
Refine and optimize data models and database structures specifically tailored for reporting and analytics.
Enforce industry best practices regarding data warehousing and storage methodologies.
Fine-tune data systems to handle the demands of both real-time streams and batch processing.
Manage the cloud data environment, utilizing platforms such as AWS, Azure, or GCP.
Coordinate with software engineers to embed data solutions directly into our product suite.
Design robust processes for ingesting both structured and unstructured datasets.
Script automated quality checks and deploy monitoring instrumentation to instantly detect data anomalies.
Build APIs and services that ensure seamless data interoperability between systems.
Continuously monitor pipeline health, troubleshooting bottlenecks to maintain an uninterrupted data flow.
Embed data governance and security protocols that meet rigorous industry standards.
Collaborate with data scientists and analysts to maximize the usability and accessibility of our data assets.
Maintain comprehensive documentation covering schemas, transformations, and pipeline architecture.
Keep a pulse on emerging trends in cloud tech, analytics, and data engineering to drive continuous improvement.
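The automated quality checks described above can be sketched in a few lines of Python. This is a minimal, hypothetical example (the record schema, field names, and check rules are assumptions, not part of the role description): each batch is scanned for missing keys, duplicates, and out-of-range values before it flows downstream.

```python
# Minimal sketch of an automated data-quality check.
# Schema (user_id, amount) and the specific rules are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    user_id: Optional[int]
    amount: float

def run_quality_checks(records: list) -> list:
    """Return a list of anomaly descriptions; an empty list means the batch passes."""
    anomalies = []
    if not records:
        anomalies.append("empty batch")
        return anomalies
    # Completeness: flag records missing the primary key.
    null_ids = sum(1 for r in records if r.user_id is None)
    if null_ids:
        anomalies.append(f"{null_ids} record(s) with missing user_id")
    # Uniqueness: flag duplicate primary keys.
    ids = [r.user_id for r in records if r.user_id is not None]
    if len(ids) != len(set(ids)):
        anomalies.append("duplicate user_id values")
    # Validity: flag values outside the expected range.
    if any(r.amount < 0 for r in records):
        anomalies.append("negative amount values")
    return anomalies

batch = [Record(1, 10.0), Record(None, 5.0), Record(1, -2.0)]
print(run_quality_checks(batch))
```

In practice such checks would typically run as a task inside an orchestrator like Airflow and feed a monitoring or alerting system rather than print to stdout.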
Requirements
A minimum of 3 years of professional experience in Data Engineering or a similar technical role.
Bachelor’s or Master’s degree in Engineering, Computer Science, Data Science, or a relevant discipline.
Expert-level command of SQL and relational database management systems such as PostgreSQL or MySQL.
Hands-on proficiency with pipeline tools such as Luigi, dbt, or Apache Airflow.
Practical experience with big data technologies such as Hadoop, Spark, or Kafka.
Proven skills with cloud data stacks such as Google BigQuery, AWS Redshift, or Azure Data Factory.
Strong programming logic in Java, Scala, or Python for data processing tasks.
Familiarity with data integration frameworks and API utilization.
Understanding of security best practices and compliance frameworks.
Data Engineer II leading development and delivery of data pipelines for Syneos Health. Collaborating with teams to optimize data processing and integrate solutions into production environments.
Lead Data Engineer overseeing data operations and analytics engineering teams for OneOncology. Focused on operational excellence in data platform and model reliability for cancer care improvement.
Senior AWS Software Data Engineer at Boeing focusing on AWS Data services to support digital analytics capabilities. Collaborating with cross-functional teams to design, develop, and maintain software data solutions.
Senior Data Engineer designing and improving software for business capabilities at Barclays. Collaborating with teams to build a data and intelligence platform for Equity Derivatives.
Senior AI & Data Engineer developing and implementing AI solutions in collaboration with clients and teams. Working on projects involving generative AI, predictive analytics, and data mastery.
Consultant driving AI business growth in Deloitte's Artificial Intelligence & Data team. Delivering innovative solutions using data analytics and automation technologies.
Data Engineer responsible for managing data architecture and pipelines at Snappi, a neobank. Collaborating with teams to enable data processing and analysis in innovative banking solutions.
Data Engineer at Destinus developing the data platform to support production and analytics needs. Involves migrating Excel sources to Lakehouse and integrating ERP systems in a hybrid role.
Senior Data Engineer developing solutions within the Global Specialty portfolio at an insurance company. Engaging with diverse business partners to ensure high quality data reporting.
Data Engineer at UBDS Group focusing on designing and optimizing modern data platforms. Collaborating in a multidisciplinary team to develop reliable data assets for analytics and operational use cases.