Introduction
Data Wranglers can be viewed as specialised Data Scientists, onboarded to a research project to guide understanding of multiple large, complex datasets and to construct optimised research workflows. It is commonly observed that data wrangling, rather than data analysis and modelling, takes up the majority of the time in research. RSF Data Wranglers thus play a crucial role in assembling data that is ‘research-ready’. As experts in data curation and quality control, they ensure that data and workflows are standardised, making it easier for other researchers to reuse data and research artefacts in extended analyses within the AIM Programme.
The significance of the work performed by RSF Data Wranglers extends beyond the confines of the AIM Programme. By making research data more accessible and usable, they contribute to the broader scientific community. Their efforts empower other researchers to build upon existing findings, perform extended analyses, and explore new avenues of inquiry. This collaborative and cumulative approach accelerates the progress of science and fosters innovation.
Activities (general approach):
Accessing, hosting, wrangling and pre-processing data efficiently, before any analysis of the research question of interest can begin, is challenging, particularly with large datasets and AI methods. In collaboration with Theme 1, Theme 2 facilitates conversations with and between researchers about infrastructure and software, collating common challenges and then providing knowledge and resources for potential solutions. Theme 2 promotes the sharing of datasets, analysis code, and workflow solutions. Furthermore, Theme 2 can offer solutions to research consortia by providing hands-on data science support.