Researchers at The Alan Turing Institute have conducted the first systematic study of Urban Dictionary, the informal, crowd-sourced online dictionary best known for slang and niche definitions.
In a paper published today in the journal Royal Society Open Science, the Turing’s Dong Nguyen, Barbara McGillivray and Taha Yasseri attempt to characterise Urban Dictionary’s content, including how opinionated and offensive its entries are. They study a complete snapshot of the website from its inception as a parody of Dictionary.com in 1999, through to 2016.
The promise of the ‘wisdom of the crowd’ has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. Yet the decentralised and often unmonitored environment of such projects leaves them susceptible to low-quality content, edit wars and destructive interactions between users. Urban Dictionary involves a community that up- and down-votes entries based on whether the voter thinks the entry is offensive, informative or funny, and whether the voter agrees or disagrees with the expressed view.
At a time when ‘facts’ are hotly contested items on the internet, Urban Dictionary is an unapologetic affront to highly referenced and cross-examined material. Most dictionaries strive towards objective content. For example, Wiktionary states ‘Avoid bias. Entries should be written from a neutral point of view, representing all usages fairly and sympathetically’. In contrast, the entries in Urban Dictionary do not always describe the meaning of a word; some contain an opinion (e.g. beer ‘Possibly the best thing ever to be invented ever. I MEAN IT.’ or Bush ‘A disgrace to America’).
But what can Urban Dictionary teach us about the reality of our language, biases, and how we actually speak day-to-day? This latest analysis uses natural language processing to shed light on the overall features of Urban Dictionary in terms of growth, coverage and types of content:
- emo, love and god are the words in Urban Dictionary with the most definitions
- definitions stating an opinion are prevalent and tend to be ranked as more offensive
- Urban Dictionary includes many unfamiliar words, words considered inappropriate in formal settings, and proper nouns
Language is constantly evolving. Over time, new words enter the lexicon, others become obsolete, and existing words acquire new meanings. The authors conclude that although Urban Dictionary captures many infrequent, informal words and also contains offensive content, highly offensive definitions tend to be ranked lower through the voting system.
Press and Communications Manager
The Alan Turing Institute
firstname.lastname@example.org 020 3862 3390
Notes to Editors
- Dong Nguyen is a Research Fellow at The Alan Turing Institute and is also affiliated with Edinburgh University.
- Barbara McGillivray is a Research Fellow at The Alan Turing Institute and the University of Cambridge.
- Taha Yasseri is a Fellow at The Alan Turing Institute and Senior Research Fellow in Computational Social Science at the Oxford Internet Institute, University of Oxford.