Within the social sciences and humanities, the potential of massive datasets, such as social media data and cultural heritage collections, for studying social and cultural phenomena is increasingly being recognised. These datasets make it possible to study language use and behaviour across a variety of social situations on a large scale, often accompanied by detailed contextual information. However, to fully leverage their potential for research in the social sciences and humanities, new computational approaches are needed.
The research focuses on developing and applying computational text analysis methods to shed light on social and cultural phenomena. While the field of natural language processing has mainly approached language as a means to convey information, people also use language to construct their identities and to build and maintain social relationships.
This work involves modelling and analysing the social dimension of language using computational approaches, and combines natural language processing and machine learning methods with insights from sociolinguistics and the social sciences.