The recently proposed GP Data Sharing scheme in England, originally due to come into effect this week, represents one of the biggest shake-ups in health data usage in NHS history. Also known as the General Practice Data for Planning and Research (GPDPR) collection, it sets out improvements and a new framework for creating a central NHS digital database from GP records in England.

The NHS was wise to pause the rollout of the GPDPR and the collection of patient data that was initially set for this week (1 September), “to provide more time to engage with GPs, patients, health charities and others, and to strengthen the plan.”

In August, it was reported that over a million people had already opted out, reflecting uncertainty caused by a perceived lack of communication and clarity about the scheme. The new framework will improve our ability to draw critical, life-saving insights from health data, provided the process is subject to all the safeguards described.

There are clear public benefits from the careful, ethical sharing of patient data, provided there are strict guarantees on its fair use. Through analysis of patient data, AI and data science can improve the detection, diagnosis, and treatment of illness. These techniques can optimise the provision of services and support health service providers in anticipating demand and delivering improved patient care.

But to realise the ambitions to improve our healthcare systems and for beneficial impacts to be felt, several key principles and elements need to be in place (at least):

  • Ethical consideration and independent governance, involving patients and the public, on data use
  • Clearly stated principles for access to data, including transparency of its use
  • The ability for patients to easily opt out at any point
  • Public and patient involvement in all aspects of analysis
  • Putting patients’ benefit first

In addition, even with these safeguards, uncertainties remain that need to be taken into account, including how organisations accessing the data will demonstrate compliance to the individuals whose data is being accessed. Some Turing ethics researchers have also argued that data safety standards, such as ISO 27001 and other data safe haven certifications, should be published and subject to random audit, and that all organisations with such access should offer transparent and fair recompense in the event of data breaches.

Some of my colleagues at the Turing also argue that subjects should have to opt in to the scheme, rather than opt out, as the current proposal requires. Further, patients should know who is accessing their data, and for what purpose. Consideration should be given to allowing patients to block access by specific users they object to on the grounds of ethics or privacy, for example.

Every data point is a person – bringing people along the journey

It’s easy to understand and empathise with these concerns, particularly around what will happen to a patient’s donated data in future: who will be able to access it, and for what purposes? It is also important to monitor and control for the fact that certain sectors of society have a greater propensity to opt out, which can leave those groups underrepresented and yield non-representative data that biases algorithms, potentially exacerbating existing inequalities in health care. Besides good public engagement around data, one way to combat this is through open-source analytics. By opening up the process of data analysis, you have a greater chance of noticing and correcting for bias in the data, and less risk of overlooking 'spurious' correlations.
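The representativeness concern can be made concrete with a small sketch. The group names, group sizes, and opt-out rates below are purely hypothetical illustrations, not NHS figures: if two equally sized groups opt out at different rates, the retained dataset no longer mirrors the population.

```python
# Hypothetical illustration of opt-out skew: two groups of equal size in the
# population, but with different (assumed) propensities to opt out.
POPULATION_PER_GROUP = 50_000
OPT_OUT_RATE = {"group_a": 0.05, "group_b": 0.30}  # assumed, not real rates

def remaining_share(opt_out_rates, population_per_group):
    """Share of each group in the retained dataset after opt-outs."""
    retained = {g: population_per_group * (1 - r) for g, r in opt_out_rates.items()}
    total = sum(retained.values())
    return {g: n / total for g, n in retained.items()}

shares = remaining_share(OPT_OUT_RATE, POPULATION_PER_GROUP)
# Although the population is 50/50, group_a now makes up
# 0.95 / (0.95 + 0.70) ≈ 57.6% of the retained records.
```

Any model trained on the retained records inherits this skew, which is why monitoring opt-out patterns, and analysing data in the open, matters.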

One of the Turing’s recent projects, in collaboration with the Bingham Centre and BIICL and partner organisations, has employed citizens’ juries to explore the role of governance and law in building public trust around the GPDPR and other kinds of critical data-driven technologies to serve public health. It lays out ‘practical guidance on how international and national institutions can build public trust in the processes by which they design and implement data-driven responses to public health emergencies’, and considers legal, human rights, policy and technical perspectives.

A way forward

The benefits of drawing insights and applying lessons from public data extend far beyond health into other realms such as finance, defence and security, and more. But the importance of trust and transparency underpins all of this work. The way in which patients do or don’t decide to engage with this new GP data sharing scheme will define what is possible for the next twenty years and beyond. We look forward to the design and careful application of data science that will benefit all walks of society.


Chris Holmes is Director of The Alan Turing Institute’s Health and Medical Sciences programme, which works with a variety of stakeholders across the NHS (such as NHSx, DHSC, and more) and also recently partnered with Roche to generate insights into disease, patient, and outcome heterogeneity using advanced analytics.