Data Engineer II managing the data ingestion efforts at SimplePractice, supporting the data-driven journey and analytics across the enterprise.
Responsibilities
Partner with analysts to build scalable systems that help unlock the value of data from a wide range of sources such as backend databases, event streams, and marketing platforms
Consult with our Product and Engineering Teams in the creation of new data in the production environment
Create company-wide alignment through standardized metrics
Promote the importance of dimensional data models for communicating across the organization
Manage the complete data stack from ingestion through data consumption
Connect our teams and their workflows to centralized and secure databases
Build tools to increase transparency in reporting company-wide business outcomes
Define and promote data engineering best practices
Design scalable data solutions leveraging cloud data technologies, preferably in AWS
Help define data quality and data security frameworks to measure and monitor data quality across the enterprise
Apply excellent problem-solving and critical-thinking skills to meet complex data challenges and requirements in a fast-paced, rapidly changing environment
Requirements
4+ years of progressive professional experience preferred
BS/MS degree in Engineering, Mathematics, Physics, Computer Science or equivalent experience
Excellent communication skills
Top-notch SQL, including statistical/window functions and complex data types
Expert in relational technology, data modeling, and dimensional modeling
Expert in at least two database engines, preferably MySQL, Snowflake, or Postgres
Metadata-driven and database-centric concepts
Database performance
Expert in ETL and ETL tools, including Airflow/Prefect, dbt, Airbyte, and Fivetran
ELT and schema-on-read concepts
Data ingestion tools, such as Kafka, DMS, Singer
At least one programming language, preferably Python
Unix/Linux scripting, such as bash
Experience with APIs, such as via curl
Experience with achieving performance through parallelism
DAGs
Experience with cloud-based infrastructure, particularly AWS
Cloud storage, S3
Data storage formats, such as Parquet, ORC
Experience with external tables
Unstructured and semi-structured data types, JSON
Experience with at least one visualization tool, preferably Looker, Tableau, or Sisense
Benefits
Privatized Medical, Dental & Vision Coverage
Work From Home stipend
Flexible Time Off (FTO), wellbeing days, paid holidays, and Summer Fridays
Monthly Meal Reimbursement
Holiday Bonus, 15-day Aguinaldo
Hybrid Work Schedule & Catered Lunch
A relocation bonus for candidates joining us from a different city