In normal times, economic growth is positively skewed; in recessions, negative skew emerges, which is known as vulnerability. This project examines whether sentiment analysis of the words used in business news can help make real-time probabilistic forecasts of the UK’s real economic growth. The forecasts will help assess the risk of vulnerability in the UK’s economic activity over the business cycle.
This project received funding from the Turing-HSBC-ONS Economic Data Science Awards 2018.
Explaining the science
The novel contribution of this project lies in assessing the vulnerability of economic growth, so for the text analysis it deploys methods developed by previous researchers in the burgeoning computational linguistics literature.
Unsupervised machine learning is used to cluster economic news 'topics' without intervention. The 'cleaned' data are decomposed into news topics using 'latent Dirichlet allocation' (LDA), a statistical natural language processing model, which effectively defines each news report as a mixture of topics. A time series of news sentiment is constructed from the proportion of total news (defined by the number of topics) assigned to each topic each day. Dictionaries are used to designate positive and negative sentiment, and then to construct a daily measure of net sentiment for each topic.
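As an illustration of the dictionary step, the following is a minimal sketch of one way a daily net sentiment measure per topic could be computed. It assumes the topic mixture of each article has already been estimated by a fitted LDA model. The word lists, the function names `article_tone` and `daily_topic_sentiment`, the topic labels, and the share-weighted averaging rule are all hypothetical choices made for this example, not the project's actual specification.

```python
from collections import defaultdict

# Hypothetical mini-dictionaries; a real study would use a published lexicon.
POSITIVE = {"growth", "strong", "gain", "recovery"}
NEGATIVE = {"recession", "weak", "loss", "crisis"}


def article_tone(tokens):
    """Net tone of one article: (positive count - negative count) / total tokens."""
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def daily_topic_sentiment(articles):
    """Aggregate article tone into a {day: {topic: net sentiment}} mapping.

    Each article is a tuple (day, tokens, topic_shares), where topic_shares
    is the article's topic mixture (assumed to come from a fitted LDA model).
    Each topic's daily sentiment is the share-weighted mean of article tone.
    """
    weighted_tone = defaultdict(lambda: defaultdict(float))
    total_weight = defaultdict(lambda: defaultdict(float))
    for day, tokens, topic_shares in articles:
        tone = article_tone(tokens)
        for topic, share in topic_shares.items():
            weighted_tone[day][topic] += share * tone
            total_weight[day][topic] += share
    return {
        day: {
            topic: weighted_tone[day][topic] / total_weight[day][topic]
            for topic in weighted_tone[day]
            if total_weight[day][topic] > 0
        }
        for day in weighted_tone
    }


# Two made-up articles published on the same day.
articles = [
    ("2018-01-02", ["strong", "growth", "in", "exports"], {"trade": 0.75, "labour": 0.25}),
    ("2018-01-02", ["weak", "hiring", "and", "loss"], {"trade": 0.25, "labour": 0.75}),
]
sentiment = daily_topic_sentiment(articles)
```

In this toy example the upbeat article is mostly about the hypothetical 'trade' topic and the downbeat one mostly about 'labour', so the two topics end the day with net sentiment of opposite sign.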
Semi-supervised machine learning is also explored: the model is 'trained' on a subset of the articles and then used to categorise articles by positive or negative sentiment. In the final stage of the process, a 'dynamic factor model' is deployed to construct an aggregate news sentiment indicator.
Given the considerable uncertainty in fitting the model (e.g. the degree of cleaning, the number of news topics, the assessment of positive and negative sentiment, the choice of dictionary, and so on), multiple sentiment indicators are constructed by varying these assumptions, with model averaging used to provide more robust predictions.
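The averaging scheme is not specified here. As a minimal sketch, assuming point forecasts and inverse mean-squared-error weights (one common choice, not necessarily the one used in the project), the combination step could look like this. The indicator names, past errors, and forecast values are made up for illustration.

```python
def inverse_mse_weights(past_errors):
    """Weight each model by the inverse of its mean squared forecast error.

    past_errors maps a model name to a list of its past forecast errors.
    Assumes every model has at least one error and a non-zero MSE.
    """
    mse = {
        name: sum(e * e for e in errs) / len(errs)
        for name, errs in past_errors.items()
    }
    inverse = {name: 1.0 / m for name, m in mse.items()}
    total = sum(inverse.values())
    return {name: v / total for name, v in inverse.items()}


def averaged_forecast(forecasts, weights):
    """Weighted average of the individual models' point forecasts."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Hypothetical indicators built under two different topic-count assumptions.
past_errors = {
    "lda_20_topics": [0.5, -0.5],  # MSE = 0.25
    "lda_50_topics": [1.0, -1.0],  # MSE = 1.0
}
forecasts = {"lda_20_topics": 0.4, "lda_50_topics": 0.9}

weights = inverse_mse_weights(past_errors)
combined = averaged_forecast(forecasts, weights)
```

The model with the smaller historical error receives the larger weight, so the combined forecast sits closer to its prediction. Averaging full predictive densities rather than point forecasts would follow the same weighting logic.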
The aim of the project is to assess whether news sentiment indicators based on business news can help make real-time probabilistic assessments about the UK’s real economic growth.
The researchers have arranged access to Thomson-Reuters' news archive to aid in the assessments being made.
The interdisciplinary research blends methods from macroeconomics, applied statistics, machine learning (automated text recognition), and applied big data methods (handling news flows).
The forecasts generated in this work, which will be updated at high frequency using real-time news flows, will produce 'fan charts' of growth. The Bank of England produces fan charts, but these are currently not linked formally to news sentiment. The fan charts produced in this work could therefore be of benefit to the Bank of England, as well as to other financial institutions and to international public sector and government organisations.
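A fan chart shows, for each forecast horizon, nested central bands of the predictive distribution. As a rough sketch, assuming the forecast for one horizon is available as a set of simulated draws, the bands could be read off as empirical percentiles. The function name `fan_bands`, the coverage levels, and the sample draws are assumptions made for this example, not the project's method.

```python
from statistics import quantiles


def fan_bands(draws, coverages=(0.3, 0.6, 0.9)):
    """Return {coverage: (lower, upper)} central bands from forecast draws.

    A band with coverage c runs from the (1 - c) / 2 percentile to the
    1 - (1 - c) / 2 percentile of the draws.
    """
    # 99 percentile cut points; cuts[p - 1] is the p-th percentile.
    cuts = quantiles(draws, n=100, method="inclusive")
    bands = {}
    for coverage in coverages:
        lower_pct = round((1 - coverage) / 2 * 100)
        upper_pct = 100 - lower_pct
        bands[coverage] = (cuts[lower_pct - 1], cuts[upper_pct - 1])
    return bands


# Made-up, left-skewed draws of quarterly growth (per cent) for one horizon.
draws = [-2.5, -1.0, -0.2, 0.1, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8]
bands = fan_bands(draws)
```

With left-skewed draws such as these, the lower edge of the widest band lies further from the centre than the upper edge, which is how negative skew shows up on a fan chart.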
The methodology should allow researchers based in policy-making institutions to produce more accurate probabilistic assessments of extreme macroeconomic events. If central banks produce more accurate risk assessments and communicate more accurately with the public, there is scope for better matching of monetary policy to macroeconomic conditions in practice.