Senior Data Engineer building scalable data pipelines at Monstro, an innovative fintech company. Shape the future of data architecture in a data-intensive environment.
Responsibilities
Build and own scalable pipelines that parse and normalize unstructured sources for retrieval, knowledge graphs, and agents.
Conceive and implement novel methods for processing thousands of types of unstructured documents accurately and consistently.
Process semi-structured sources into consistent, validated schemas.
Transform structured datasets for analytics, features, and retrieval workloads.
Create, version, and maintain multiple collections in a vector database.
Manage embeddings, metadata, and lifecycle, and tune chunking and filters for relevance and latency.
Design and implement robust multi-modal document processing systems that handle heterogeneous file formats (PDFs, images, HTML, XML).
Own ingestion from APIs, file drops, partner feeds, and scheduled jobs with monitoring, retries, and alerting.
Implement data quality checks for schema, ranges, and nulls, and document lineage and SLAs.
Stand up and harden object, relational, document, and vector stores with the right indexing and partitioning.
Build reusable libraries and services for parsing, enrichment, and embedding generation.
Handle sensitive financial and personal data with access controls, auditing, and retention policies.
Partner with product and engineering to ship features that depend on reliable data.
Document standards, coach teammates, and contribute to future hiring.
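To make the data quality responsibility above concrete, here is a minimal sketch of schema, range, and null checks over a batch of records. The field names, types, and the non-negative balance rule are illustrative assumptions, not Monstro's actual schema.

```python
# Hypothetical expected schema: field name -> required Python type.
EXPECTED_SCHEMA = {"account_id": str, "balance": float, "currency": str}

def check_record(record: dict) -> list[str]:
    """Return a list of data quality violations for one record."""
    violations = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        value = record.get(field)
        if value is None:
            violations.append(f"null or missing field: {field}")
        elif not isinstance(value, expected_type):
            violations.append(f"wrong type for {field}: {type(value).__name__}")
    # Example range check (assumption): balances must be non-negative.
    balance = record.get("balance")
    if isinstance(balance, float) and balance < 0:
        violations.append("balance out of range: negative")
    return violations

def check_batch(records: list[dict]) -> dict[int, list[str]]:
    """Map record index -> violations, keeping only failing records."""
    results = {}
    for i, record in enumerate(records):
        violations = check_record(record)
        if violations:
            results[i] = violations
    return results
```

In a production pipeline, checks like these would typically run as a validation stage with results exported to monitoring and alerting rather than returned in memory.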
Requirements
Minimum 2 years in a dedicated Data Engineering role at an AI-native startup or 4+ years of experience in traditional Data Engineering, with ~8+ years of experience in Tech overall.
Proven ownership of end-to-end pipelines (ingestion → transformation → serving), including scalable sourcing processes, ETL pipelines, and serving services.
Experience owning and operating infrastructure in production environments.
Strong Python and SQL skills.
Hands-on document parsing and ETL experience across PDFs, HTML, JSON, and XML.
Experience operating vector databases such as pgvector, Pinecone, or Weaviate, with multiple collections.
Experience building and scheduling ingestion via APIs, web downloads, and cron or an orchestrator, plus working with cloud storage and queues.
Understanding of embeddings, chunking strategies, metadata design, and retrieval evaluation.
Solid data modeling, schema design, indexing, and performance tuning across storage types.
History of implementing data quality checks, observability, and access controls for sensitive data.
Track record of delivering high-consistency systems for mission-critical data pipelines.
Ownership mindset, clear written communication, and effective collaboration with product and engineering.
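As a concrete illustration of the chunking and metadata design mentioned in the requirements, here is a minimal sketch of overlapping fixed-size chunking with per-chunk metadata. The chunk size, overlap, and metadata fields are illustrative assumptions; real systems often chunk on semantic boundaries instead.

```python
def chunk_text(text: str, doc_id: str, size: int = 200, overlap: int = 50) -> list[dict]:
    """Split text into overlapping character windows with retrieval metadata."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append({
            "doc_id": doc_id,            # lets retrieval filter by source document
            "chunk_index": len(chunks),  # stable ordering for reassembly
            "start_offset": start,       # maps the chunk back to the original text
            "text": text[start:start + size],
        })
    return chunks
```

Metadata like `doc_id` and `start_offset` is what makes vector-store filtering and source attribution possible at query time.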