Detecting and understanding harmful content online

Developing benchmarks and datasets for online harms researchers, and guidance for practitioners using tools to detect online harms


This project aims to systematise research on online harms (e.g. research on hate speech). It will do this by compiling lists of datasets and benchmarks for comparing different approaches to the same problem (e.g. benchmarks for comparing hate speech classifiers), and by developing guidelines for practitioners who wish to use the outputs of online harms research (e.g. government or policy experts who want a quantitative understanding of hate speech in a particular context).

Project aims

The project will produce three outcomes: 

  1. Datasets or lists of datasets related to online harms.
  2. A deeper understanding of how different tools for detecting and understanding online harms (e.g. hate speech classifiers) work.
  3. Guidance for practitioners who wish to use the outcomes of recent research into online harms. Such guidance may include, for example, how context should be taken into account when deciding between two different tools.

In summary, this project aims to create 'meta tools': tools and best practices for using or combining existing tools for detecting and understanding online harms.

Recent updates

March 2020: a website now collects together several known datasets related to hate speech online. It is expected to serve as a resource for researchers in the area.


Researchers and collaborators