Staff Data Engineer responsible for data strategy and pipeline management at PayJoy. Ensuring data quality and leading engineering best practices for organizational efficiency.
Responsibilities
Architect and Build Data Pipelines: Build, optimize, and maintain reliable, scalable, and efficient data pipelines for both batch and real-time data processing.
Data Strategy: Develop and maintain a data strategy aligned with business objectives, ensuring data infrastructure supports current and future needs.
Streaming Expertise: Lead the development of real-time ingestion pipelines using Kafka/Kinesis, and design data models optimized for streaming workloads.
Data Quality & Governance: Implement data quality checks, schema evolution, lineage tracking, and compliance using tools such as Unity Catalog and Delta Lake.
Tool & Technology Selection: Evaluate and implement the latest data engineering tools and technologies that will best serve our needs, balancing innovation with practicality.
Automation and CI/CD: Drive automation of pipeline deployment, testing, and monitoring using Terraform, CircleCI, or similar tools.
Performance Tuning: Regularly review, refine, and optimize SQL queries across different systems to maintain peak performance. Identify and address bottlenecks, query performance issues, and resource utilization problems. Establish best practices and educate developers on performance considerations throughout the software development lifecycle.
Database Administration: Manage and maintain production AWS RDS MySQL, Aurora, and PostgreSQL databases. Perform routine database operations, including backups, restores, and disaster recovery planning. Monitor database health, and diagnose and resolve issues in a timely manner.
Knowledge and Training: Serve as the primary point of contact for database performance and usage knowledge, providing guidance, training, and expertise to other teams and stakeholders.
Monitoring & Troubleshooting: Implement monitoring solutions to ensure high availability and troubleshoot data pipeline issues in real-time.
Documentation: Maintain comprehensive documentation of systems, pipelines, and processes for easy onboarding and collaboration.
Mentorship & Leadership: Mentor other engineers, review PRs, and establish best practices in data engineering.
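In practice, the data-quality responsibility above is usually handled by platform tooling such as Delta Lake constraints or a dedicated validation framework; purely as an illustration of the kind of check involved, here is a minimal hand-rolled sketch (all field names are hypothetical):

```python
def check_quality(rows, required_fields, non_null=()):
    """Minimal batch data-quality check.

    Every row must contain the required fields, and the fields listed in
    `non_null` must not be None. Returns a list of (row_index, problem)
    tuples; an empty list means the batch is clean.
    """
    problems = []
    for i, row in enumerate(rows):
        for field in required_fields:
            if field not in row:
                problems.append((i, f"missing field: {field}"))
        for field in non_null:
            if row.get(field) is None:
                problems.append((i, f"null field: {field}"))
    return problems


# Example: one clean row, one null id, one missing id.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"amount": 3.0},
]
issues = check_quality(rows, ["id", "amount"], non_null=["id"])
```

Production pipelines would attach checks like this as gates in the orchestrator so that bad batches are quarantined rather than propagated downstream.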
Requirements
Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field.
12+ years of experience in data engineering, including at least 3 years working with Databricks.
Deep hands-on experience with Apache Spark (PySpark/SQL), Delta Lake, and Structured Streaming.
Technical Expertise: Deep understanding of data engineering concepts, including ETL/ELT processes, data warehousing, big data technologies, and cloud platforms (e.g., AWS, Azure, GCP).
Strong proficiency in Python, SQL, and data modeling for both OLTP and OLAP systems.
Architectural Knowledge: Strong experience in designing and implementing data architectures, including real-time data processing, data lakes, and data warehouses.
Tool Proficiency: Hands-on experience with data engineering tools such as Apache Spark, Kafka, Databricks, Airflow, and modern data orchestration frameworks.
Innovation Mindset: A track record of implementing innovative solutions and reimagining data engineering practices.
Experience with Databricks Workflows, Delta Live Tables (DLT), and Unity Catalog.
Familiarity with stream processing patterns (exactly-once semantics, watermarking, and checkpointing).
Benefits
100% Company-funded health, dental, and vision insurance for employee and immediate family
Company-funded employee life and disability insurance
3% employer 401k contribution
Company holidays; 20 vacation days; flexible sick leave
Headphones, home office equipment, and wellness perks
Data Engineer/Analyst maintaining and improving data infrastructure for Braiins. Collaborating with technical and business teams to ensure reliable data flows and insights.
Medior Data Engineer handling Azure migrations for a major urban mobility client. Focused on data pipeline development and ensuring platform reliability with cutting-edge technologies.
Developing ML and computer vision solutions for a cutting-edge autonomous vehicle dataset pipeline at Mobileye. Collaborating across teams on data curation and advanced perception algorithms.
Data Migration Lead in a hybrid role managing data migration for a major transformation programme in the media sector. Collaborating with various teams to ensure data integrity and successful migration.
Consultant ML & DataOps at Smile integrating data science projects for major clients. Designing MLOps solutions and enhancing data governance in a collaborative environment.
Data Engineer developing and maintaining data pipelines for Coolbet’s analytical services. Working within an Agile framework to ensure data reliability and efficiency.
API Data Engineer developing innovative data-driven solutions and advancing data architecture for AI Control Tower. Building and integrating APIs and data pipelines to support organizational needs.
Journeyman Data Architect supporting Leidos' enterprise data and analytics program for the Department of War. Collaborating on solutions for data architecture, cloud environments, and governance.
Senior Software Engineer developing backend services and data infrastructure for integrated products at Booz Allen. Collaborating with a small elite team to deliver reliable and scalable services.
AWS Streaming Data Engineer developing software and systems in a fast, agile environment. Utilizing experience with real-time data ingestion and processing systems across distributed environments.