Data Engineer II building pilot datasets and production-grade data platforms at GeoComply. Collaborating with product teams to deliver data-driven features for geolocation compliance.
Responsibilities
Build Pilot Datasets: Rapidly design and develop experimental data models and datasets to support pilot product features and validate hypotheses.
Bridge Business & Data: Collaborate closely with product managers to translate functional requirements into initial data schemas and logic.
Ad-Hoc to Self-Serve: Construct initial logic for ad-hoc data requests and evolve them into standardized, self-serve tools for the product team.
Productionize Pipelines: Take successful pilot datasets and transform them into robust, production-grade data pipelines. Refactor "pilot code" to follow best practices, ensuring high performance and data quality in the production environment.
Foundation Development: Build and maintain the internal libraries, services (Airflow), and Databricks jobs required to run these datasets at scale.
Project Management: Manage the lifecycle of data products from "epic-size" concepts through to delivery and maintenance.
Stakeholder Communication: Effectively communicate technical constraints and data insights to stakeholders during the transition from pilot to production.
Product Ideation: Actively contribute ideas on how data can drive new product benefits during weekly team calls.
Requirements
Four (4) years of relevant experience, with a focus on bridging database technology with product requirements.
You possess strong product thinking skills: the ability to analyze requirements, identify customer pain points, and build data solutions that contribute directly to the product vision.
Strong skills in Databricks, Spark/PySpark, and SparkSQL to manipulate heavy datasets during both the discovery and production phases.
Extensive experience with MySQL (for operational data) and NoSQL, with an understanding of how to model data for analytics vs. applications.
Proven ability to operate on epic-size projects, specifically managing the timeline from "proof of concept" to "delivered feature".
Strong interpersonal skills, adept at explaining complex data concepts to non-technical product stakeholders.
Familiarity with Git, Linux environments, and the Prometheus/Loki stack for monitoring data health.
Data Engineer designing, building, and maintaining scalable data platforms and pipelines at Kyndryl. Utilizing Azure cloud solutions and adhering to modern software development practices.
Junior Data Engineer responsible for developing and maintaining software programs and scripts under guidance. Collaborating with a software engineer to ensure compliance with policies and security standards.
Data Engineer developing scalable data pipelines and ETL processes at Walmart. Collaborating with cross-functional teams to ensure seamless data integration and quality.
Software Developer Specialist contributing to data processing solutions in a big data framework at Verafin. Collaborating with cross-functional teams to ensure data accuracy and pipeline reliability in a flexible work environment.
Principal Data Engineer designing, building, and maintaining data pipelines for finance analytics at Northrop Grumman. Collaborating with engineers and finance analysts to ensure data accuracy and availability.
Senior Data Engineer responsible for migrating and modernising data platforms in banking. Rebuilding a critical data platform with a focus on risk and core financial data flows.
Data Engineering Lead managing enterprise-scale data platforms using AWS, Snowflake, and Databricks in financial services. Leading data engineering teams and ensuring data governance.
AWS Data Engineer working in Gurugram to support data architecture and integration solutions. Collaborating with teams and translating business needs into data models.
Senior Data Engineer handling data engineering responsibilities in a hybrid setting for the banking industry. Collaborating with cross-functional teams and maintaining data quality in Azure environments.
Data Management professional at Kyndryl involved in creating innovative data solutions and ensuring the seamless operation of complex data systems. Collaborating with teams to transform requirements into scalable database solutions.