A new report from The Alan Turing Institute, ‘Data science and AI in the age of COVID-19,’ draws on insights from around 100 experts across a wide range of disciplines, including ethicists, clinicians, mathematicians and policy advisors, to capture the experiences and key contributions of the UK’s data science and AI community during the first wave of the COVID-19 pandemic. Following the Turing’s public COVID-19 conference and a series of workshops in late 2020, the report provides a snapshot of the community’s contributions, as well as the initiatives and resources that were developed during the pandemic.

It finds that scientists and researchers from across the scientific spectrum, the data science and AI community included, have stepped up to work within or alongside clinicians, policymakers and government at the heart of the response. But substantial challenges and barriers still prevent data science and AI from being used to their full potential. The report highlights a need for:

  • Greater robustness and timeliness of data: Workshop participants reported difficulties accessing sufficiently timely, robust and granular data, as well as issues with data standardisation and documentation. Some participants also reported problems of privilege, with some researchers (e.g. non-health specialists) finding it much harder to access data than others.

    An example is the simple question “Was the patient ventilated?” While seemingly straightforward and intuitive, in some cases it was impossible for clinicians to answer with a simple “yes” or “no” because of the complex treatment regimes of many COVID-19 patients.
     
  • Greater equality and inclusion: Participants highlighted concerns about inadequate representation of minority groups in data and low engagement with these groups, as well as concerns about inequalities in the ease with which researchers could access data, and about a lack of diversity within the research community and decision-making organisations.

    In the scramble to use existing datasets, or to rapidly create new ones, there was a risk of proceeding without due regard for sampling biases (such as insufficient representation of minority groups being ‘baked into’ datasets through systemic discrimination, structural inequalities and data collection constraints), which could in turn lead to biased research and policies that exacerbate pre-existing inequalities. Traditionally underrepresented groups also tend to be scarce in digital data – especially those on the ‘invisible’ side of the digital divide, who may not have access to the internet or a computer.
     
  • Transparent communications: While there have been excellent examples of science communication throughout the pandemic, participants highlighted the challenges of communicating research findings and uncertainties to policymakers and the public in a timely, accurate and clear manner. They also highlighted challenges with identifying which studies had ‘cut through’ and been considered by government and expert advisory groups when making policy interventions. 

    And while it is difficult to do, participants said it is critical to communicate transparently with other researchers, policymakers and the public, particularly around issues of modelling and uncertainty. Better communication was cited as necessary to avoid the dangers of data and research being misused or misinterpreted. Trust in, and acceptance of, data science-based interventions might also be increased by communicating what data were used, and how modelling methods were applied to inform policies. 

Workshop participants have made a number of suggestions for how the data science and AI community might address these challenges. Some of these include improving access to cleaned and anonymised data; more equitable data access; protocols for data standardisation; increased representation of, and engagement with, minority groups; training for researchers in communicating their work to non-specialists; and initiatives to increase public understanding of research findings and uncertainty, to better counter misinformation. 
 
Recently, the scientific academies of the Group of Seven (G7), led by the Royal Society, also published a call for greater ‘data readiness’ in preparation for future health emergencies, ahead of the G7 Summit (11-13 June). This is a timely amplification of this report’s message about the need for better data access and sharing across the board, in order to respond better to the impacts of new COVID-19 variants, and to assess vaccine and policy impact at local, regional, national and international levels.

Michael Wooldridge, Programme Co-Director for AI at The Alan Turing Institute and report editor, said:

“As the national institute for data science and AI, we convened these workshops with the intention of capturing a snapshot of the data science and AI community’s voices and experiences in the pandemic response. We hope the reflections it contains will provide useful preliminary insights for policymakers, researchers and public health bodies alike to consider as we continue to manage the impact of the COVID-19 pandemic.

“If the community can make progress in these areas of timely access to quality data; equality and inclusivity; and transparent communications, then when we are next faced with a pandemic – and the historical record strongly suggests that this is a ‘when’ rather than an ‘if’ – we should be better placed as a collective to respond.”

The Turing would like to thank all of the workshop participants for their time and insights, as well as the four report editors: Inken von Borzyskowski (Assistant Professor of Political Science, UCL), Bilal Mateen (Clinical Data Science Fellow, The Alan Turing Institute; Clinical Technology Lead, Wellcome Trust), Anjali Mazumder (AI and Justice & Human Rights Theme Lead, The Alan Turing Institute), and Michael Wooldridge (Turing Fellow and Programme Co-Director for Artificial Intelligence, The Alan Turing Institute). 
 

Cover image courtesy of William Perugini via Shutterstock