Data Curation Developer at GSK preparing high-quality data assets for R&D analysis through collaboration and technical expertise. Handle diverse datasets and ensure compliance with privacy and analysis standards in a hybrid work environment.
Responsibilities
Lead the development of business requirements for data curation through collaboration with R&D business and data platform teams
Maintain strong connections with analytical groups and R&D Data Platform teams to ensure seamless data integration and usage
Deliver pre-packaged, curated datasets aligned to business requirements for analytics
Document data specifications that describe the processing steps required to generate analysis-ready datasets
Integrate diverse datasets (e.g., clinical trials, real-world data, omics) into a unified format for consistent analysis
Ensure all datasets meet analysis-ready and privacy requirements by performing necessary data curation activities
Provide coaching and peer review to ensure that the team’s work reflects industry best practices for data curation activities
Ensure that datasets are processed to meet the conditions set out in the approved data re-use request
Write clean, readable code
Ensure that deliverables are appropriately quality controlled and documented, and can be handed over to the R&D Tech team for production pipeline implementation
Requirements
BSc/MSc/PhD (or equivalent) in Computer Science, Mathematics, Statistics, or related subject
Proven experience handling various modalities of scientific clinical data, such as clinical trial data (including biomarkers), real-world data (RWD), and omics
Experience with Python, Databricks, Delta Lake, PySpark, pandas, and other data engineering frameworks
Proven ability to handle and process large structured, semi-structured, and unstructured datasets efficiently
Strong communication skills and the expertise to translate business needs into technical data requirements and processes
Ability to quantify, and provide insight into, the business impact and value created by data curation activities
Experience with at least one industry data standard, such as CDISC (ODM, CDASH, SDTM, ADaM), HL7 FHIR, or OMOP (CDM)
Vice President of Engineering developing and implementing technology solutions in Wealth Management. Overseeing application development and leading management teams in a hybrid setting.
Avionics Technician II responsible for installing and repairing avionics systems for aerospace vehicles. Collaborating on high-stakes missions while ensuring reliability and performance in critical systems.
Fiori Developer creating and integrating SAP Fiori solutions at PwC Poland. Collaborating with teams on tailored applications and managing project timelines.
Senior Developer developing software solutions for Henry Schein in a hybrid work environment. Collaborating with teams to deliver high-quality applications and software maintenance.
Develop specialized software and systems for efficient management of Glory's deposit equipment. Ensure operational continuity of critical solutions through analysis, maintenance, and support.
Intern in Industrial Engineering and Lean Manufacturing supporting Continuous Improvement projects. Collaborating with engineering teams for innovative solutions in a leading medical technology company.
General Manager overseeing day-to-day operations at Shermco Industries. Responsibilities include staff management and process improvement in the electrical testing field.
C++ Developer with Qt framework experience at Capgemini Engineering. Working on communication protocols, simulations, and requirement analysis among other duties.
Senior Director in technical product leadership at Dell Technologies, leading a team for AI Infrastructure and tech strategies. Overseeing AI compute and storage, and engaging with design teams and executives.
CNC Programmer at Primetals Technologies creating optimal CNC programs and documentation with complex models. Collaborating with engineers to enhance product design and manufacturability.