How data science can help measure innovation in the economy

Thursday 21 Nov 2019

Introduction

From the steam engine to the emergence of AI, innovation and new ideas have always been instrumental in transforming economies.

One of the key ambitions of the Turing’s Finance and Economics Programme is to build a digital twin of the economy by mapping it in real time, measuring the flow of goods and services and assessing the development of trading relationships to understand the economy at scale.

Achieving this is closely connected to our ability to understand the development and diffusion of emerging technologies across various sectors of the economy. This information can also help policymakers make informed decisions regarding the mix of skills, resources, funding and infrastructures needed.

Yet existing industrial taxonomies often don’t allow innovation to be measured in real time, and new sectors such as AI or immersive technologies don’t fit the pre-existing codes. With the rise of data science, policymakers have access to novel data-driven methods to supplement traditional innovation indicators, including detailed and timely analytics, open datasets and novel visualisation methods. These new methods, however, come with a set of challenges that hinder their widespread adoption.

Turing teams up with Nesta to deliver an Innovation Mapping Hackweek

To help support the development of the community of researchers and Research and Innovation (R&I) policymakers, the Turing’s Finance and Economics Programme partnered with Nesta’s Innovation Mapping team to deliver HackSTIR, a Hack Week focused on applying novel data-driven methods to the problem of innovation mapping.

The week was hosted by the innovation foundation Nesta on 21-25 October in partnership with the Turing, with support from the Intellectual Property Office and SAGE. For the participants, HackSTIR became an immersive environment for learning, knowledge exchange and rapid project prototyping.

Over the course of the week, the participating teams looked at a broad spectrum of challenges, including mapping trends in digital social innovation using social media data, predicting the success of research funding proposals, identifying policy themes from raw text and trying to understand what differentiates successful proposals from unsuccessful ones. The datasets included social media (Twitter), policy initiative databases, and funding proposal data.

Nesta led several tutorials on machine learning, Python data analysis, and web scraping; the Intellectual Property Office offered a patents tutorial; and in addition three Turing researchers delivered tutorials throughout the week.

Andrew Elliott, Research Associate, spoke about using network science to uncover important entities, highlighting the importance of selecting the correct tool for the dataset in question; Kirstie Whitaker, Research Fellow, led the discussion on the importance of reproducibility (slides here and background on the Turing Way reproducibility project here); and Elena Kochkina, Doctoral Student, delivered a tutorial on using natural language processing to automatically analyse human-written texts.

Learnings and future outlook

Several themes emerged across the participating teams, who identified challenges related to the quality, representativeness and interpretability of innovation-related datasets. Text-based, interconnected data such as social media can be noisy and unreliable. Reducing it to a representative sample (one that is not skewed or biased), analysing it and communicating the results clearly can all be challenging.

The second aspect of the interpretability challenge relates to explaining the meaning of complex analytical outputs. When applying analytics to research and innovation data, there is also the challenge of text classification and topic labelling: for example, different scientific fields may refer to similar projects differently.
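To make the labelling challenge concrete, here is a minimal Python sketch (not code from the Hack Week). It measures how many words two project descriptions share. The two descriptions and the helper names `tokens` and `jaccard` are invented for illustration; they describe broadly similar work in the vocabulary of two different fields.

```python
import re


def tokens(text: str) -> set:
    """Lowercase a description and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words that two descriptions have in common (0 to 1)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Hypothetical descriptions, made up for this example only.
cs_description = "deep neural networks for image classification"
stats_description = "nonparametric statistical models for visual pattern recognition"

overlap = jaccard(cs_description, stats_description)
print(f"word overlap: {overlap:.2f}")
```

The only word the two descriptions share is "for", so a labelling approach that relies on shared keywords would be unlikely to group them together, even though a human reader might.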

Finally, ethics and governance questions are as pertinent as ever when it comes to innovation policy, including how to avoid bias and distortion, black-box algorithmic outputs, and unreproducible results.

The UK is already a global leader in innovation, ranked fifth in the World Intellectual Property Organization’s Global Innovation Index 2019, and has the stated goal of being “the most innovative country in the world.”

The Turing recognises that a data-driven understanding of techno-scientific outputs, collaborations, trajectories, geographies and skills can transform the way innovation policy is done. Our understanding of text in business websites, patent offices and social media platforms can be enhanced through the use of real-time data about software development (e.g. GitHub), networking (e.g. Meetup), or academic publications (e.g. arXiv).

These data sources, combined with increased analytical capabilities and insights from complexity science, can help policymakers identify gaps and opportunities in innovation systems and make predictions about future technological trajectories. They also pose challenges and risks, but we are optimistic that they will open up new opportunities for the international research and innovation community to thrive.

The Finance and Economics Programme at the Turing is already funding several projects focused on understanding the changing trends in the labour market and measuring innovation activity in the economy, including "Labour supply in the gig economy", "Network modelling of the UK's urban skill base", and "Predicting economic growth from business news".

All workshop materials and Innovation Mapping tutorials are available here. For future HackSTIR events, please contact George Richardson at Nesta.