A is for algorithm: Why we created a data science and AI glossary

The thinking behind the Turing’s new glossary

Monday 27 Sep 2021

This week, we’ve added a glossary to The Alan Turing Institute’s website, with 24 definitions of key terms in data science and artificial intelligence (AI), including algorithmic bias, digital twin, neural network and synthetic data.

There is a lot of jargon in data science and AI, so our aim is to create an accessible resource for non-specialists who want to find out more about these topics without having to navigate the technical language. We’re hoping to counter some of the misinformation around these topics, lead the conversation about them, and bring some clarity to the terms that people hear in everyday life – algorithm, deepfake, robot – while also introducing readers to new concepts, like deep learning, natural language processing or the Turing test. We’re also hoping that it will be a useful resource for journalists and policy makers, as well as researchers in areas that intersect with data science and AI.

The genesis of this project was a Twitter campaign that we ran in February and March 2021, in which we tweeted a definition of a word, phrase or name over 26 days, creating an A-Z of data science and AI. Encouraged by the warm reaction to the campaign, we set about creating a more permanent glossary on our website, drawing up a new list of terms with a slightly more technical flavour (no space for Eric the robot in our glossary, alas!).

I wrote the definitions in collaboration with several researchers at the Turing (big thanks to James Geddes, Adrian Weller and Mhairi Aitken), and it’s fair to say that this wasn’t always an easy process! Crystallising complex concepts in a limited number of sentences took a few iterations. The biggest challenge was keeping the definitions clear and accessible without compromising on accuracy, and without introducing a load of jargon into the definitions themselves. I found ‘beginner’s mind’ to be a useful concept when writing the definitions – I tried to approach the topics as if it were the first time that I’d encountered them.

We’ve arrived at 24 definitions that we are happy with – for now! We’re aware that the world of data science and AI is ever-changing, and some of these definitions will need updating, so we’ll be regularly reviewing the glossary and adding new terms to the list. We are keen to refine the definitions and learn from our varied audiences, so if you spot something that we can improve, please get in touch!

To accompany the glossary’s launch, we have also released two other related pieces of work:

‘What is an algorithm?’ A short animated guide to the science of algorithms 

The Turing Podcast: How to communicate science to non-specialists 
The Turing’s Ethics Research Fellow Mhairi Aitken and Science Writer James Lloyd discuss why we need science communicators in the first place, what makes for good communication, and what specific challenges are associated with communicating data science and AI research to the general public.