The social sciences and the humanities increasingly recognize the potential of massive datasets, such as social media data and cultural heritage collections, for studying social and cultural phenomena. These datasets make it possible to study language use and behaviour across a variety of social situations on a large scale, often with detailed contextual information. Fully leveraging this potential, however, requires new computational approaches.
This project focuses on developing and applying computational text analysis methods to shed light on social and cultural phenomena. The field of natural language processing has mainly approached language as a means of conveying information, but people also use language to construct their identities and to build and maintain social relationships.
To that end, it models and analyses the social dimension of language using computational approaches, combining natural language processing and machine learning methods with insights from sociolinguistics and the social sciences.