Infrastructure-Focused Data Engineer for NVIDIA’s Data & Observability Platform. Developing data pipelines and managing the Data Lakehouse for massive-scale operations.
Responsibilities
Build Scalable Data Pipelines: Develop and deploy high-throughput, reliable pipelines to move substantial volumes of telemetry information from global edge locations to our central Data Lakehouse.
Architect the Data Lakehouse: Lead the implementation of our tiered storage strategy. You will design efficient schemas that optimize for both write-heavy real-time ingestion and fast, cost-effective interactive queries.
Orchestration & Automation: Modernize workflow scheduling by implementing robust, code-based data pipelines. You will build workflows that handle complex dependencies, automated backfills, and intelligent retries.
Drive Embedded Data Optimization: Partner directly with internal engineering teams to audit their data usage. You will identify heavy-hitter datasets and primary storage consumers, refactor inefficient schemas, and enforce lifecycle policies to significantly reduce storage costs.
Manage Data Infrastructure: Own the operation of the underlying platform. You will manage stateful deployments on Kubernetes, optimize Spark performance, and ensure the reliability of our streaming architecture.
Enforce Quality & Governance: Implement automated schema validation and data quality checks to prevent bad data from entering the lake. You will collaborate with security teams to apply automated masking and access controls.
Requirements
BS or MS in Computer Science, Electrical Engineering, or related field (or equivalent experience).
8+ years of experience in Data Engineering with a strong focus on Infrastructure, Streaming, or Platform building.
Strong Coding Fluency: Expert proficiency in Python for automation, tooling, and orchestration.
Proficiency in Java or Scala for high-performance data processing (Spark/Flink).
Deep Streaming Expertise: Extensive experience with Kafka.
Data Lake Experience: Hands-on experience with modern table formats (Apache Iceberg, Delta Lake, or Hudi) and distributed query engines (Trino/Presto/Spark).
Containerization & Ops: Experience deploying, configuring, and debugging applications on Kubernetes using Helm.
Benefits
Equity and benefits
Job title
Senior System Software Engineer – Data Engineering
Senior Cloud Data Engineer maintaining RealTruck’s data warehouse for analytics and reporting. Collaborating globally and mentoring junior engineers while ensuring data accessibility and quality.
Data Architect designing solutions for hospital patient monitoring systems at Philips. Focused on leveraging performance data and customer feedback to drive technology differentiation and customer success.
Data Engineer responsible for constructing and maintaining data pipelines at Evertec, ensuring reliable data for business decisions in a financial technology context.
Data Engineer working on innovative software solutions aimed at enhancing operational efficiency across sectors. Join a passionate team at one of the leading software companies in Brazil.
Senior Data Architect leading data architecture and evolution for federal programs at SteerBridge. Responsible for high-performance data infrastructures and mentoring a team of data engineers.
Data Architect leading enterprise architecture strategy and implementing solutions in data management. Collaborating with stakeholders and teams for integration and data quality standards implementation.
Senior Data Engineer designing and optimizing modern data architectures in a challenging financial IT project. Working remotely with an agile team in Germany, focusing on scalable data platforms.
Principal Data Engineer at Trainline shaping robust data foundations for AI and ML-driven products. Collaborate with cross-functional teams to ensure best practices in data engineering and ML.
Data Migration Consultant responsible for migrating various data types into Microsoft Dynamics 365 CE. Analyze structures, develop mapping documentation, and ensure data integrity during migration.
Senior Consultant focusing on data migrations within Master Data Management for Scheer group. Engaging in project leadership, consulting, and strategy development for data migration projects.