Your primary role is to transform raw data ingested by the Big Data team and make it available in an automated and documented manner for data analysts and scientists within our Databricks ecosystem.
Identifying and transforming large volumes of data in our Databricks environment so that they are readily usable by data analysts and scientists.
Designing, developing and maintaining robust and resilient data transformation workflows in Databricks (Lakeflow/Workflows).
Optimizing query and processing performance to ensure low latency and high resource efficiency.
Managing the ongoing maintenance and evolution of workflows based on user needs, and conducting impact analyses of upstream changes to data structures.
Working closely with the metrics team (responsible for data collection), the Big Data team (to understand the ingestion process) and the team of data analysts/scientists (to understand business needs).
Implementing and maintaining data access governance via Unity Catalog (permissions, sensitive data masking, lineage tracking).
Implementing and maintaining a data quality control process.
Providing technical support to data scientists for setting up MLOps pipelines (training data preparation, automated feature engineering) and incorporating them into production flows.
Establishing and documenting data engineering and modelling best practices within the team.
Paying special attention to legal compliance, security and privacy considerations at all stages of data transformation and sharing.
Requirements
Three or more years’ experience in a similar role.
Graduate degree in a relevant field (business intelligence, economics, mathematics, statistics, IT, etc.).
Strong proficiency with Python and Spark; Scala an asset.
Knowledge of Databricks: Lakeflow, workflow development, performance optimization and metadata management via Unity Catalog.
Experience with orchestration tools (e.g., Airflow, Databricks Workflows).
Knowledge of other Big Data technologies and cloud platforms (Azure, GCP or AWS) an asset.
Proficiency in versioning tools such as Git.
Passion for digital and emerging technologies.
Familiarity with Radio-Canada platforms and content an asset.
Benefits
Flexible work schedules, allowing you to prioritize yourself, your family and your work.
Work-from-home opportunities.
Competitive total rewards package.
Opportunities to work with cutting-edge technology.
Opportunities for continued learning and professional development.
Opportunities to become a member of our Employee Resource Groups.
Pair programming and mentorship opportunities, where you can learn from the best in the industry and help coach new talent.
A creative and dynamic work environment, where your ideas and contributions can be heard, valued and respected.
A supportive management team committed to upholding the highest standards of diversity and inclusivity.
An environment that favours experimentation and an iterative approach in order to achieve the highest form of technical innovation.
Job title
Data Scientist, Digital Development – French Services