Data Engineer responsible for creating and maintaining data pipelines for machine learning models at a financial services company. Requires proficiency in modern database technologies and programming skills.
Responsibilities
Support the deployment of machine learning models to production environments: supply models with data from the warehouse or directly from source systems, configure data attributes, manage compute resources, and set up monitoring tools (AWS CloudWatch, AWS EventBridge);
Manage stored data and structure it appropriately using database management systems;
Monitor the overall performance and stability of automated data pipelines, adjusting them as requirements evolve;
Collect and integrate data across different environments, from both structured and unstructured sources; serve as a technical point of reference;
Track the results of the team's action plans, ensuring value-driven and high-quality deliveries aligned with organizational objectives;
Build and maintain data catalogs;
Perform ETL code reviews.
Requirements
Proficient knowledge of leading database models and technologies (OLTP, OLAP, Data Lake, Data Warehouse, and Big Data);
Programming skills to develop, customize, and manage integration tools, databases, warehouses, and analytical systems (AWS DMS, AWS Glue, AWS Lambda, Step Functions, Glue Crawler, Athena, Redshift, S3, DynamoDB, Delta Tables);
Knowledge of file formats such as JSON, Parquet, Avro, ORC, etc.;
Experience with data cataloging tools such as Amundsen and AWS Glue Catalog;
Experience with CI/CD pipelines for ETL code;
Extensive experience working as a Data Engineer.
Benefits
CAJU Flexible Benefits Card - Food/Meal/Mobility vouchers and more.
Health insurance.
Dental plan (Bradesco).
Life insurance.
Study incentive programs.
English course - English Pass (partnership available to all DM employees).
Wellhub for exercising.
DM VISA card with a pre-approved limit.
Flexible working hours.
Private pension.
ShortDay - log off earlier so you can take care of yourself!
No dress code – what matters is that you feel comfortable.
Day off on your birthday - a gift from DM to you!
PPR (Profit Sharing Program).
For parents: daycare assistance for children up to 6 years old.
For parents: extended maternity and paternity leave - 180 days for mothers and 20 days for fathers.