Senior Data Engineer designing and maintaining data processing pipelines for analytics and machine learning in a fast-paced startup. Collaborating with cross-functional teams to ensure data accuracy and security.
Responsibilities
Design, develop, and maintain scalable data processing pipelines and workflows using frameworks such as Apache Spark, PySpark, and Apache Beam.
Build and maintain microservices in Python that serve data-driven features in production.
Develop internal tools to support CI/CD pipelines, experiment tracking, and data versioning.
Collect, process, and integrate large datasets from multiple sources, including databases, file systems, and APIs.
Ensure data integrity, consistency, and quality through robust validation and monitoring processes.
Optimize data systems for performance, scalability, and high availability.
Implement best practices for data security, access control, and privacy.
Collaborate with data scientists, analysts, and engineers to support analytics and ML workflows.
Requirements
5+ years of professional experience in software engineering or data engineering.
Strong software engineering skills with Python in large-scale, high-performance production environments.
Hands-on experience with Spark/PySpark and other big data frameworks.
Expertise in data modeling and working with both structured and unstructured data.
Hands-on experience with streaming data platforms, particularly Apache Kafka.
Strong understanding of distributed systems and modern data architectures.
Experience working with cloud platforms, preferably GCP (BigQuery, Dataflow, Pub/Sub, Dataproc).
Excellent problem-solving and communication skills.
Benefits
Office Snacks and Activities: Fuel your work with a variety of snacks and enjoy fun activities that keep our team spirit high. Whether it's a darts match, board games, or yoga, we make time for it because we believe a happy team is a productive team.