The development of new and effective drugs, health services and policies requires close cooperation between clinical researchers, the pharmaceutical industry and patients. For this reason, regulators worldwide, including the US Food and Drug Administration and the European Medicines Agency, are pushing for patients and their experience to be put at the centre of clinical research.

This increasing focus on the patient voice has led to a greater interest in online health forums (OHFs) – public message boards such as HealthBoards and Inspire that have been quietly chronicling the experiences, anxieties, suffering and resilience of millions of patients worldwide over the past 20 years. Often run by charities and patient advocacy groups, OHFs allow patients, and at times healthcare providers, to post about health conditions and concerns in search of advice, practical information and emotional support.

The millions of OHF threads and posts that chart personal experience of disease contain a wealth of information on interaction with health systems, paths to diagnosis, comorbidities, patient concerns, and side effects and emerging off-label uses of medications (i.e. uses that haven’t been approved by regulators). In our work at the Turing, in cooperation with colleagues at Queen Mary University of London (QMUL), Barts Cancer Institute and King’s College London, we have applied a blend of machine learning techniques to OHF data in order to map the impact of disease on patients’ daily lives.

Technically, OHF data presents a multi-layered challenge. First, we use natural language processing techniques to mine individual posts, with the aim of interpreting the meaning of words and sentences automatically. Second, the complex ways in which users interact and spread information (or at times, unfortunately, misinformation) on a forum can be tracked using network analysis techniques that trace the chain of replies across threads, highlighting user communities. In this way, for example, we can identify valuable support groups and patient blogs. Finally, time series analysis of forum posts can help quantify major trends such as the impact of the COVID-19 pandemic, and also yield longitudinal insight into patients’ individual journeys, such as time to diagnosis for rare diseases.
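To make the network and time series layers a little more concrete, here is a minimal Python sketch. It is not the pipeline used in our studies: the post records, user names and function names are all hypothetical, and simple connected components stand in for proper community detection. It only illustrates the idea of tracing reply chains to group users, and of counting posts per week to see trends.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical toy data: (post_id, author, id of the post replied to or None, date).
POSTS = [
    (1, "ana", None, date(2020, 1, 6)),
    (2, "ben", 1, date(2020, 1, 7)),
    (3, "cat", 2, date(2020, 1, 15)),
    (4, "dev", None, date(2020, 1, 16)),
    (5, "eli", 4, date(2020, 1, 17)),
]


def reply_graph(posts):
    """Build an undirected user graph linking each replier to the author replied to."""
    author_of = {post_id: author for post_id, author, _, _ in posts}
    graph = defaultdict(set)
    for _, author, parent, _ in posts:
        graph[author]  # register the user even if they never reply to anyone
        if parent is not None and author_of[parent] != author:
            graph[author].add(author_of[parent])
            graph[author_of[parent]].add(author)
    return graph


def user_groups(graph):
    """Return connected components, a crude stand-in for community detection."""
    seen, groups = set(), []
    for start in list(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            user = stack.pop()
            if user in group:
                continue
            group.add(user)
            stack.extend(graph[user] - group)
        seen |= group
        groups.append(group)
    return groups


def posts_per_week(posts):
    """Count posts per ISO (year, week) pair to expose trends over time."""
    return Counter(d.isocalendar()[:2] for _, _, _, d in posts)


if __name__ == "__main__":
    print(user_groups(reply_graph(POSTS)))
    print(posts_per_week(POSTS))
```

On real forum data, the connected-components step would typically be replaced by a dedicated community detection method, and the weekly counts would be broken down by topic or condition.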

A specific aim of our analysis is to complement or replace traditional ‘concept elicitation’ studies, which use structured interviews with patients to tease out information about their experiences – a time-intensive process. In one recent study, we analysed OHF data to find out how physical activity is affected for patients suffering from chronic heart failure, chronic obstructive pulmonary disease or fibromyalgia. Identifying the aspects of physical activity most affected by these long-term conditions, whether that is sleep, exercise, getting out of bed or doing the housework, will help to inform the use of wearable activity monitors (such as smartwatches and Fitbit-like devices) in clinical trials. That’s because, in order to provide useful insights into these conditions, researchers need to collect the activity data that is most relevant to patients’ real-world experiences.

In another study, recently published in the journal BMJ Open, we looked at OHF conversations from the start of the pandemic, when OHFs became, for many, the go-to resource for learning about COVID-19 and its implications for other diseases and treatments. Our analysis of more than 700,000 posts from January to May 2020 revealed evolving concerns about symptoms and comorbidities of COVID-19, and a growing number of posts about anxiety and other mental health conditions. A summary of our findings is available via this dashboard, which allows detailed exploration of those first few months of the pandemic through the eyes of OHF users.

Our analyses made use of technology developed by Mebomine, a QMUL spin-out that we launched in 2019. With our technology, we hope to extract useful information that can benefit clinical research and quality of treatment, while sidestepping the delays, expense and bureaucratic hurdles of the standard interview-based approach to concept elicitation. Even where such interviews may in the end be necessary, insights from OHFs can help fine-tune questionnaires prior to interviews, thus ensuring that valuable patient time and experience are used in the best possible way.

Read the paper:
Analysis of mental and physical disorders associated with COVID-19 in online health forums: a natural language processing study

Top image: Alan Warburton / BBC / Better Images of AI