Past and current projects
Nowcasting positive test counts with reporting lag
Principal Investigators
Dr Radka Jersakova, The Alan Turing Institute
Dr James Lomax, National Cyber Security Centre and The Alan Turing Institute
Research team
Mark Briers, The Alan Turing Institute
James Hetherington, University College London
Chris Holmes, The Alan Turing Institute
Brieuc Lehmann, University College London
George Nicholson, University of Oxford
Overview
To monitor the current state of COVID-19, the UK government tracks the number of positive tests in each local authority. Since it takes time to process PCR swab tests, there is a delay of up to five days before all positive test results are reported.
The goal of this project is to 'nowcast' daily positive test counts up to the present date. A 'nowcast' is a prediction informed by analysis of the data currently available. Using statistical models, we can infer the expected final count from the incomplete data as it arrives. This estimate can be used to compensate for the lag in data reporting and aid decision making.
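As a rough illustration of the idea, the sketch below scales each recent day's partial count by an assumed fraction of results reported so far. This is a simple multiplicative correction, not the Bayesian imputation model described in the technical report. The function name `nowcast_counts` and all numbers are hypothetical and chosen only for illustration.

```python
def nowcast_counts(partial_counts, completeness):
    """Estimate final daily counts from partially reported counts.

    partial_counts: counts reported so far, oldest day first.
    completeness: assumed fraction of the final count already reported
        for each day (hypothetical values, each in (0, 1]).
    """
    if len(partial_counts) != len(completeness):
        raise ValueError("inputs must have the same length")
    estimates = []
    for count, fraction in zip(partial_counts, completeness):
        if not 0 < fraction <= 1:
            raise ValueError("completeness must be in (0, 1]")
        # Scale the partial count up to the expected final count.
        estimates.append(count / fraction)
    return estimates


# Illustrative numbers only: the most recent day is assumed 40% reported.
reported = [120, 110, 90, 60, 30]
fraction_reported = [1.0, 0.95, 0.85, 0.65, 0.40]
print(nowcast_counts(reported, fraction_reported))
```

In practice the completeness fractions are themselves uncertain and would need to be estimated from historical reporting patterns, which is where a full statistical model adds value over this point estimate.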
Timeframe
Project completed
Outputs
Technical report pre-print now available – Bayesian imputation of COVID-19 positive test counts for nowcasting under reporting lag.
View the code on GitHub.
Spatial and temporal modelling of incidence and prevalence of COVID-19
Principal Investigator
Professor Marta Blangiardo, Imperial College London
Research team
Tullia Padellini, Imperial College London
Radka Jersakova, The Alan Turing Institute
Peter Diggle, Lancaster University
Chris Holmes, The Alan Turing Institute
Brieuc Lehmann, University College London
Ruairidh King, MRC Harwell
Ann-Marie Mallon, The Alan Turing Institute
George Nicholson, University of Oxford
Sylvia Richardson, University of Cambridge
Luis Santos, MRC Harwell
Overview
Our goal is to estimate the prevalence of COVID-19, combining several sources of data, accounting for their biases and uncertainty.
We aim to predict the burden of the disease by integrating two different types of information on the number of cases: direct estimates (such as randomized surveys and testing programs) and indirect estimates (such as hospital admissions). We provide a flexible modelling framework, which is adjusted for known risk factors and accounts for spatial as well as temporal dependencies in the data.
Timeframe
Project completed
Outputs
Research paper:
Padellini, T., Jersakova, R., Diggle, P.J., Holmes, C., King, R.E., Lehmann, B.C., Mallon, A.M., Nicholson, G., Richardson, S. and Blangiardo, M., 2022. Time varying association between deprivation, ethnicity and SARS-CoV-2 infections in England: A population-based ecological study. The Lancet Regional Health-Europe, p.100322.
Blog
South Asians in poorer areas more at risk of catching COVID-19
Estimating COVID-19 prevalence and transmission from multiple sources: de-biasing Pillar 2 data
Principal Investigators
George Nicholson, University of Oxford
Brieuc Lehmann, University of Oxford
Research team
Tullia Padellini, Imperial College London
Koen Pouwels, University of Oxford
Radka Jersakova, The Alan Turing Institute
James Lomax, The Alan Turing Institute
Ruairidh King, MRC Harwell
Ann-Marie Mallon, The Alan Turing Institute
Luis Santos, MRC Harwell
Izzy Russell, MRC Harwell
Peter Diggle, Lancaster University
Sylvia Richardson, University of Cambridge
Marta Blangiardo, Imperial College London
Chris Holmes, The Alan Turing Institute
Overview
Our goal is to estimate COVID-19 prevalence and transmission rates at a fine-scale level, such as local authority, by harnessing data from multiple testing sources. We are designing a statistical model to adaptively adjust for biases and coherently combine information across multiple data streams.
Background
The daily or weekly number of positive COVID-19 tests in a region is widely used as a proxy for the local number of infected individuals.
Multiple testing sources
Positive test numbers arise from:
Randomized surveillance (REACT study, ONS survey).
Pillar 2 testing focused on testing symptomatic individuals.
Local mass testing at the level of cities, universities, care homes etc.
Testing bias
Test results are subject to sampling and operational influences:
Ascertainment bias: symptomatic individuals are prioritized for testing, so the rate of positive tests is greater than the actual disease prevalence in the population.
False positive/negative test results: tests for COVID-19 infection, such as PCR and lateral flow, vary in sensitivity and specificity.
Weekday effects: the number of tests performed depends strongly on the day of the week.
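To make the effect of imperfect sensitivity and specificity concrete, the sketch below applies the standard Rogan-Gladen correction to an observed positive rate. It addresses only false positive/negative results, not ascertainment bias or weekday effects, and it is not the causal debiasing framework developed in this project. The function name and the example values are hypothetical.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Correct an observed positive rate for imperfect test accuracy.

    Uses the Rogan-Gladen estimator:
        (apparent + specificity - 1) / (sensitivity + specificity - 1)
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("test must perform better than chance")
    corrected = (apparent_prevalence + specificity - 1.0) / denominator
    # Clamp to [0, 1], since sampling noise can push the estimate outside.
    return min(1.0, max(0.0, corrected))


# Illustrative values only: 10% of tests positive, with an assumed
# sensitivity of 0.80 and specificity of 0.95.
print(rogan_gladen(0.10, sensitivity=0.80, specificity=0.95))
```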
Timeframe
Initial project now completed; the model is now being applied to other project datasets.
Outputs
Research paper
Nicholson, G., Lehmann, B., Padellini, T. et al. Improving local prevalence estimates of SARS-CoV-2 infections using a causal debiasing framework. Nat Microbiol 7, 97–107 (2022).
Blog
Why COVID-19 test data is skewed, and what we're doing to fix it.
Detecting COVID-19 using biomedical acoustic markers
Principal Investigators
Steven Gilmour, King's College London
Davide Pigoli, King's College London
Stephen Roberts, University of Oxford
Bjoern Schuller, Imperial College London
Research team
Kieran Baker, King's College London
Jobie Budd, University College London
Harry Coppock, Imperial College London
Chris Holmes, The Alan Turing Institute
Ivan Kiskin, University of Surrey
Vasiliki Koutra, King's College London
George Nicholson, University of Oxford
Overview
The Biomedical acoustic markers project aims to develop a process to identify features in audio signals (voice and speech sounds) that are caused by COVID-19. This has the potential to be a fast and easy-to-use early test, paving the way for mass testing. It also has the potential, in due course, to be used for the early detection of other diseases.
Our work builds on early-stage research from Cambridge and other research groups, which reported how an algorithm (a sequence of rules that a computer uses to complete a task) could accurately identify COVID-19 positive patients who had no symptoms, using audio recordings of coughs from a small test group.
To evaluate the possibility that COVID-19 results in unique features in individuals’ speech and airway sounds, we have collected a world-leading respiratory sounds COVID-19 dataset. It is superior to previous datasets thanks to the number of recordings collected, the richness of the metadata (information about the dataset) and the quality of the ground truth labels (data confirming that the recorded COVID-19 status is the correct diagnosis).
On top of this, we have carefully created two subsets of the dataset to train and evaluate the model. The first set, known as the training set, is what we let the model see and learn from. The second set, known as the test set, is how we evaluate the performance of the model. We have created these partitions to address the bias in the dataset. Most importantly, the test set is curated to feature matched pairs of individuals. The individuals in each pair share the same characteristics (e.g. symptoms) except for their COVID-19 status. Therefore, if we classify a pair correctly, we are more confident that this is due to true COVID-19 audio signals rather than other symptoms.
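One way to picture the matched-pair idea is the minimal sketch below, which pairs each COVID-19 positive participant with a negative participant who has identical values for a chosen set of characteristics. The field names (`id`, `covid_positive`, `cough`, `age_band`) and the function `build_matched_pairs` are hypothetical, and this is not the project's actual curation procedure.

```python
from collections import defaultdict


def build_matched_pairs(records, match_keys):
    """Pair positive and negative participants who agree on match_keys."""
    positives = defaultdict(list)
    negatives = defaultdict(list)
    for record in records:
        # Participants are grouped by their values for the matching fields.
        key = tuple(record[k] for k in match_keys)
        bucket = positives if record["covid_positive"] else negatives
        bucket[key].append(record["id"])
    pairs = []
    for key, pos_ids in positives.items():
        # zip stops at the shorter list, so unmatched participants are dropped.
        pairs.extend(zip(pos_ids, negatives.get(key, [])))
    return pairs


# Illustrative records only.
records = [
    {"id": "a", "covid_positive": True, "cough": True, "age_band": "30-39"},
    {"id": "b", "covid_positive": False, "cough": True, "age_band": "30-39"},
    {"id": "c", "covid_positive": True, "cough": False, "age_band": "30-39"},
]
print(build_matched_pairs(records, ["cough", "age_band"]))
```

Here participant "c" has no negative counterpart with the same characteristics, so it is left out of the paired test set.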
If our study proves positive, this algorithm, released as a smartphone app, has potential as a rapid and affordable screening tool for COVID-19 and possibly other diseases.
Timeframe
From December 2020.
Using wastewater data to monitor COVID-19
Principal Investigator
Marta Blangiardo, Imperial College London
Research team
Peter Diggle, Lancaster University
Helen Duncan, The Alan Turing Institute
Philip Li, Northumbria University
Callum Mole, The Alan Turing Institute
George Nicholson, University of Oxford
Camila Rangel Smith, The Alan Turing Institute
Sylvia Richardson, University of Cambridge
Barry Rowlingson, Lancaster University
Fatemeh Torabi, Swansea University
Overview
People infected with COVID-19, with or without symptoms, shed the virus through their digestive systems or during daily activities, and it ends up in wastewater. This process is called shedding. It is now known that as the number of COVID-19 patients in an area increases, the amount of virus detected in wastewater (the viral load) also increases. Wastewater is tested at wastewater plants as part of the regular testing of wastewater samples.
The Environmental Monitoring for Health Protection (EMHP) wastewater monitoring program, led by the UK Health Security Agency, tests wastewater on a daily basis. Monitoring started in mid-2020 and continues to gather data from 270 sites across England.
This project seeks to use these data to address research questions such as:
- How can estimates of disease frequency derived from wastewater data at specific points in time be used alongside more commonly used health monitoring data?
- Does wastewater data add value to monitoring diseases?
- And how can we best design wastewater sampling schemes for real-time monitoring, either using only wastewater data, or combined with traditional monitoring data in a cost-effective manner?
During the first phase of the project, the team will focus on the first of these research questions and also work to identify priorities for future research. The data will first be explored, and the team will then analyse different time periods and different places in the United Kingdom.
Timeframe
From November 2021
Investigating transmission of COVID-19 using mobility data
Principal Investigator
Yee Whye Teh, University of Oxford
Research team
Helen Duncan, The Alan Turing Institute
Tor Erlend Fjelde, University of Cambridge
Hong Ge, University of Cambridge
Michael Hutchinson, University of Oxford
Radka Jersakova, The Alan Turing Institute
Callum Mole, The Alan Turing Institute
George Nicholson, University of Oxford
Camila Rangel Smith, The Alan Turing Institute
Sylvia Richardson, University of Cambridge
Overview
This project aims to improve our understanding of how people’s movement affects the spread of the COVID-19 virus. This work has the potential to provide insight for policy makers on, for example, the likely impact that travel outside a person's local area has on controlling the spread of the virus.
We will create a high-quality infectious disease transmission model (a model is a framework to show the relationship between variables in a dataset) that uses real-time mobility data. This work builds on a space and time model (the Epimap model), previously developed by our team, which produces local estimates of transmission. The model also considers other factors such as population density, social and economic deprivation, vaccination coverage and information on variants.
The outputs of this project, such as the model, will be open source (accessible to all) to generate discussion, increase transparency, and make the work easier for other researchers to reuse and for policy makers to access.
Timeframe
From August 2021.
Investigating COVID-19 transmission using gene sequencing (Genomics+)
Principal Investigator
Ewan Birney, European Bioinformatics Institute
Research team
Tom Fitzgerald, European Bioinformatics Institute
Kumar Gaurav, European Bioinformatics Institute
Chris Holmes, The Alan Turing Institute
Timeframe
From January 2022