Hybrid Senior Data Engineer

Posted 3 months ago

About the role

  • As a Senior Data Engineer at Monstro, an innovative fintech company, you will build scalable data pipelines and shape the future of data architecture in a data-intensive environment.

Responsibilities

  • Build and own scalable pipelines that parse and normalize unstructured sources for retrieval, knowledge graphs, and agents.
  • Design and implement novel methods for processing thousands of types of unstructured documents accurately and consistently.
  • Process semi-structured sources into consistent, validated schemas.
  • Transform structured datasets for analytics, features, and retrieval workloads.
  • Create, version, and maintain multiple collections in a vector database.
  • Manage embeddings, metadata, and lifecycle, and tune chunking and filters for relevance and latency.
  • Design and implement robust multi-modal document processing systems that handle heterogeneous file formats (PDFs, images, HTML, XML).
  • Own ingestion from APIs, file drops, partner feeds, and scheduled jobs with monitoring, retries, and alerting.
  • Implement data quality checks for schema, ranges, and nulls, and document lineage and SLAs.
  • Stand up and harden object, relational, document, and vector stores with the right indexing and partitioning.
  • Build reusable libraries and services for parsing, enrichment, and embedding generation.
  • Handle sensitive financial and personal data with access controls, auditing, and retention policies.
  • Partner with product and engineering to ship features that depend on reliable data.
  • Document standards, coach teammates, and contribute to future hiring.

Requirements

  • At least 2 years in a dedicated Data Engineering role at an AI-native startup, or 4+ years of experience in traditional Data Engineering, with roughly 8 or more years of experience in tech overall.
  • Proven ownership of end-to-end pipelines (ingestion → transformation → serving), including scalable sourcing processes, ETL pipelines, and serving services.
  • Experience owning and operating infrastructure in production environments.
  • Strong Python and SQL.
  • Hands-on experience with document parsing and ETL across PDFs, HTML, JSON, and XML.
  • Experience operating vector databases such as pgvector, Pinecone, or Weaviate, with multiple collections.
  • Experience building and scheduling ingestion via APIs, web downloads, and cron or an orchestrator, plus cloud storage and queues.
  • Understanding of embeddings, chunking strategies, metadata design, and retrieval evaluation.
  • Solid data modeling, schema design, indexing, and performance tuning across storage types.
  • History of implementing data quality checks, observability, and access controls for sensitive data.
  • Track record of delivering high-consistency systems for mission-critical data pipelines.
  • Ownership mindset, clear written communication, and effective collaboration with product and engineering.

Benefits

  • Equity
  • Flexible work arrangements

Job title

Senior Data Engineer

Experience level

Senior

Salary

$160,000 - $200,000 per year

Degree requirement

Bachelor's Degree
