Dr Mallon will lead a research data team (data wrangling team) at the Turing with a focus on delivering research-ready data in a strategic and systematic manner. The provisioning of data is often the single largest hurdle for analytical projects, and the aim of the team is to work with programmes at the Turing to remove this burden for analysts and statisticians and to provide tools, standards and processes for integrating and provisioning high-quality data. Making data ready for research can involve many barriers and bottlenecks, and the team is keen to build partnerships and identify projects that are dealing with these challenges.
Dr Mallon became a programme leader at the MRC Harwell Institute in 2010, leading a data science group which focused on the analysis of data from large-scale functional genomics projects. As part of this role, Dr Mallon has led the NIH-funded IMPC (International Mouse Phenotyping Consortium) Data Coordination Centre (DCC) for the last 10 years, focusing on the capture, quality control and integration of multi-dimensional data from over 10 international centres. The IMPC is an international effort to identify the function of every protein-coding gene in the mouse genome and to illuminate the function of the large proportion of the mammalian genome that remains undefined. The underpinning role of the DCC was vital to the success of this project, ensuring that the data aggregated from multiple geographical locations and studies were standardised and quality controlled to enable reproducible and robust data analysis. To date, the project has generated data for over 9,000 genes and a dataset of over 74 million data points. The impact is reflected in a number of high-profile and seminal publications [Nature (Dickinson, M.E. et al., 2016), Nature Genetics (Meehan, T.F. et al., 2017), Nature Communications (x3) (Bowl, M.R. et al., 2017; Karp, N.A. et al., 2017; Rozman, J. et al., 2018)]. In addition, Dr Mallon has led the data provisioning theme for a five-year collaboration between the Big Data Institute (BDI) in Oxford and Novartis, capturing and delivering research-ready data for two key disease domains. This dataset on Multiple Sclerosis (MS) and IL-17 (Cosentyx clinical trials) has aggregated clinical and imaging data on over 50,000 patients. Over the last 18 months, Dr Mallon has expanded her research portfolio in collaboration with the Health Programme at the Turing Institute, working on three key projects (UKHSA-RSS-Turing Lab, EDON and the NIHR AIM RSF) with a focus on data readiness, data provisioning, data quality and downstream reproducibility.