Tracking abuse on Twitter against football players in the 2021-22 Premier League season

Abstract

Online abuse against prominent sportspeople, such as football players, is a growing concern. To help understand this issue, we have launched a new project analysing tweets directed at Premier League footballers with an account on Twitter. The analysis covers a period of 165 days (roughly five months), from the start of the 2021/2022 season (13 August 2021) to the winter break (24 January 2022). We did not analyse online abuse in the Women’s Super League, the highest league of women’s football in England; the dynamics and patterns of abuse experienced by women players warrant dedicated research of their own.

Twitter is the focus of this report for three reasons. First, Twitter is a large and widely used platform, and many Premier League football players are active on it. Second, several players have previously reported being abused on Twitter, notably following the Euro 2020 final, which makes the platform directly relevant to this research. Third, unlike most platforms, Twitter makes data available for academic research via its free-to-use API, making this type of analysis possible. This research is not intended as a reflection or commentary on Twitter’s trust and safety practices: we did not investigate who saw each tweet, how many times tweets were viewed, how long abusive posts stayed online, or what safety measures the platform applied.
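
The report does not detail its data-collection pipeline. Purely as an illustration, the sketch below shows how tweets mentioning a player could be retrieved from the Twitter API v2 full-archive search endpoint, which the academic research track made available free of charge at the time; the player handle, query filters and environment variable are illustrative assumptions, and rate-limit handling is omitted.

```python
# Minimal sketch (not the study's pipeline): collecting tweets that
# mention a player via the Twitter API v2 full-archive search endpoint,
# available under the academic research track. Rate-limit handling omitted.
import os
import requests

SEARCH_URL = "https://api.twitter.com/2/tweets/search/all"
HEADERS = {"Authorization": f"Bearer {os.environ['TWITTER_BEARER_TOKEN']}"}

def fetch_mentions(handle: str, start: str, end: str):
    """Yield tweets mentioning @handle between two ISO-8601 timestamps."""
    params = {
        "query": f"@{handle} -is:retweet",  # direct mentions, no retweets
        "start_time": start,
        "end_time": end,
        "max_results": 500,                 # full-archive maximum per page
        "tweet.fields": "created_at,author_id",
    }
    while True:
        resp = requests.get(SEARCH_URL, headers=HEADERS, params=params)
        resp.raise_for_status()
        payload = resp.json()
        yield from payload.get("data", [])
        next_token = payload.get("meta", {}).get("next_token")
        if next_token is None:
            break
        params["next_token"] = next_token   # page through the archive

# Example window: season start (13 August 2021) to the winter break.
for tweet in fetch_mentions("example_player_handle",  # placeholder handle
                            "2021-08-13T00:00:00Z", "2022-01-24T23:59:59Z"):
    print(tweet["id"], tweet["text"])
```

In practice, the query design (for example, how mentions, replies and multiple handles per player are handled) would need to match the study’s methodology, which this summary does not specify.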

This report is quali-quantitative in nature, comprising manual review of 3,000 tweets by expert annotators; the creation of a new machine learning tool that automatically assesses whether tweets are abusive; and large-scale analysis of 2.3 million tweets. The report was commissioned by Ofcom as part of the Online Harms Observatory, a new analytics platform from The Alan Turing Institute generously supported by the Department for Digital, Culture, Media and Sport (DCMS).
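
The abuse-detection tool itself is not reproduced here. As a rough illustration only, the sketch below shows how a fine-tuned transformer classifier might be applied to tweets in batches; the model identifier is a hypothetical placeholder, not the tool built for this study.

```python
# Illustrative sketch only: batch-scoring tweets with a fine-tuned
# transformer classifier. "example-org/abuse-detector" is a hypothetical
# model id, not the machine learning tool created for this report.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="example-org/abuse-detector",  # hypothetical fine-tuned model
    truncation=True,
)

tweets = [
    "What a performance today, brilliant goal!",
    "You should never wear the shirt again.",
]

# Batched inference keeps scoring millions of tweets tractable.
for tweet, result in zip(tweets, classifier(tweets, batch_size=32)):
    print(f"{result['label']} ({result['score']:.3f}): {tweet}")
```

Key findings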

  1. The majority of tweets we qualitatively analysed are Positive. Of 3,000 randomly sampled tweets that we qualitatively analysed, 55% are Positive towards players, 27% are Neutral, 12.5% are Critical and 3.5% are Abusive. 
  2. Our qualitative and quantitative results give different estimates of the prevalence of abuse. 3.5% of the qualitatively analysed random sample of 3,000 tweets are Abusive, compared with 2.6% of the 2.3 million tweets we analysed with machine learning.
  3. The percentage of content which is Abusive is low. Of the 2.3 million tweets we analysed with our machine learning tool for detecting abuse, 2.6% contain abuse (n = 59,871). This is nonetheless a large absolute number, and it creates a serious risk of harm to the players.
  4. Identity attacks comprise a small percentage of all abuse. Only 8.6% of Abusive tweets, or 0.2% of all tweets (n = 5,148), contain a reference to the player’s identity (i.e. a protected characteristic, such as religion, race, gender or sexuality).
  5. The majority of players received abuse at least once. 68% of players received at least one Abusive tweet during the period (418/618). One in fourteen (7%) received abuse every day.
  6. Abuse varies over time, with peaks following key events. In particular, on two days there were substantial increases in both the total number and the percentage of tweets that were Abusive. For instance, on 7 November 2021, when Harry Maguire sent a tweet about Manchester United’s performance, 10.6% of tweets were Abusive (n = 2,903).
  7. A small proportion of players receive the majority of abuse. For instance, 12 players account for 50% of all Abusive tweets. Cristiano Ronaldo and Harry Maguire received the largest numbers of Abusive tweets.
  8. Many users send just one Abusive tweet. Of the 44,907 users who sent at least one Abusive tweet, 82.3% sent only one. The remaining 17.7% of users (n = 7,948) sent more than one Abusive tweet, accounting for 35% of all abuse (see the sketch after this list for how such summary statistics can be derived).
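
The figures above are simple aggregations over per-tweet labels. As a sketch, assuming a table with one row per classified tweet and columns tweet_id, author_id, player and label (our assumed schema, not the report’s), findings 3, 5, 7 and 8 could be derived as follows:

```python
# Sketch: reproducing the headline statistics from per-tweet labels.
# Assumes a CSV with one row per tweet and columns tweet_id, author_id,
# player, label, where label is Positive/Neutral/Critical/Abusive.
import pandas as pd

tweets = pd.read_csv("labelled_tweets.csv")  # hypothetical file
abusive = tweets[tweets["label"] == "Abusive"]

# Finding 3: prevalence of abuse.
print(f"Abusive share: {len(abusive) / len(tweets):.1%} (n = {len(abusive):,})")

# Finding 5: share of players receiving any abuse. (The report divides by
# the full 618-player roster; here only players present in the data count.)
print(f"Players abused at least once: "
      f"{abusive['player'].nunique() / tweets['player'].nunique():.0%}")

# Finding 7: how few players account for half of all Abusive tweets.
per_player = abusive["player"].value_counts()   # sorted descending
cumulative = per_player.cumsum() / per_player.sum()
print(f"Players covering 50% of abuse: {(cumulative < 0.5).sum() + 1}")

# Finding 8: users who sent exactly one Abusive tweet.
per_user = abusive["author_id"].value_counts()
print(f"One-off abusive users: {(per_user == 1).mean():.1%}")
```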

If you have questions about this report or would like more information about The Alan Turing Institute’s research, please contact Pica Johansson ([email protected]).

The Alan Turing Institute’s public policy programme

The public policy programme works alongside policymakers to explore how data-driven public service provision and policy innovation might solve long-running policy problems, and to develop the ethical foundations for the use of data science and artificial intelligence in policy-making. Our aim is to contribute to the Institute's mission – to make great leaps in data science and artificial intelligence research in order to change the world for the better – by developing research, tools and techniques that have a positive impact on the lives of as many people as possible.

The Online Harms Observatory

The Online Harms Observatory is a new analytics platform from The Alan Turing Institute’s public policy programme. It combines large-scale data analysis and cutting-edge machine learning developed at the Turing to provide real-time insight into the scope, prevalence and dynamics of harmful content online. It aims to help policymakers, regulators, security services and civil society stakeholders better understand the landscape of online harms. Initially, it will focus on online hate, personal attacks, extremism and misinformation. The Observatory is supported by the Department for Digital, Culture, Media and Sport (DCMS).

Funding

This report was commissioned by Ofcom, in relation to its upcoming role as the UK’s Online Safety regulator. It is one output of a larger Turing project utilising the new Online Harms Observatory. In-kind support was given by The Alan Turing Institute.
