Introduction

Words mean different things depending on time, context, and speakers (think of the recent social-media meaning of 'tweet'), which poses a challenge for any analysis of unstructured text data. Computational research has made great advances in detecting meaning change in language from textual data, but so far it has neither worked on ancient languages nor engaged humanists, who can offer invaluable expertise in designing and validating these systems. This highly interdisciplinary project is the first to focus on an ancient language, and it has taken the first steps towards building computational models of meaning change that engage classicists.

Explaining the science

Bayesian computational models of meaning change (which infer temporal meaning representations as probability distributions over words) have been proposed to find meaning change in texts (Frermann & Lapata 2016). This project incorporates expert-driven knowledge (specifically on the genre of texts) to further improve the models.
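
As a rough illustration of what "meaning representations as probability distributions over words" can look like, the sketch below hand-codes two senses of a single target word and their prevalence in two time slices, then measures how much the sense prevalence shifts. All names and numbers here are hypothetical and set by hand; an actual model of this kind would infer these distributions from texts, and this sketch does not include genre information.

```python
import math

# Hypothetical toy values: two senses of one target word, each represented
# as a probability distribution over context words (each row sums to 1).
SENSE_WORD_PROBS = {
    "sense_a": {"ship": 0.6, "sea": 0.3, "law": 0.1},
    "sense_b": {"ship": 0.1, "sea": 0.1, "law": 0.8},
}

# Hypothetical prevalence of each sense in two time slices (rows sum to 1).
SENSE_PREVALENCE = {
    "early": {"sense_a": 0.9, "sense_b": 0.1},
    "late": {"sense_a": 0.4, "sense_b": 0.6},
}


def context_word_prob(word, period):
    """p(word | period) = sum over senses of p(sense | period) * p(word | sense)."""
    return sum(
        SENSE_PREVALENCE[period][sense] * word_probs.get(word, 0.0)
        for sense, word_probs in SENSE_WORD_PROBS.items()
    )


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (log base 2) between two dict distributions."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a, b):
        # Terms with zero probability in `a` contribute nothing.
        return sum(
            a[k] * math.log2(a[k] / b[k]) for k in keys if a.get(k, 0.0) > 0
        )

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


# How strongly the sense prevalence differs between the two time slices.
change = jensen_shannon(SENSE_PREVALENCE["early"], SENSE_PREVALENCE["late"])

if __name__ == "__main__":
    print(context_word_prob("law", "early"))
    print(context_word_prob("law", "late"))
    print(change)
```

With base-2 logarithms the divergence lies between 0 (identical sense prevalence) and 1, so a larger value indicates a stronger shift in how the word is used across the two periods.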

Ancient Greek was chosen for several reasons. Its scholarship provides excellent validation data (we know the outcomes) and external knowledge bases (for example, on the genre of texts). Many of its words are highly polysemous. There are top-quality transcribed texts (no need to correct OCR errors), which enables applications to born-digital texts. Greek forms its own branch of the Indo-European family, unlike Latin (whose descendants are the Romance languages) and English (Germanic); confounding factors from closely related languages therefore do not apply to Greek, making it a more controlled environment and an ideal testbed for applications to modern languages.

Project aims

  1. Build the first large-scale annotated corpus of Ancient Greek.
  2. Develop Bayesian learning models of meaning change that use genre information.
  3. Annotate Ancient Greek texts with semantic information and use them to evaluate the computational models.
  4. Disseminate the results in natural language processing and digital humanities venues.

Applications

This work is relevant to humanities scholarship on how words and concepts change. It can also support historical semantic search over large text collections, letting users look for words with different meanings in different historical periods.

Recent updates

September 2019

  • Journal article: McGillivray, B., Hengchen, S., Lähteenoja, V., Palma, M., Vatri, A. (2019). A computational approach to lexical polysemy in Ancient Greek. Digital Scholarship in the Humanities. doi.org/10.1093/llc/fqz036

September 2018

  • Talk: "A computational approach to semantic change in post-Classical Greek" at the workshop "Beyond Standards: Attic, the Koiné and Atticism", University of Cambridge (Dr Barbara McGillivray and Dr Alessandro Vatri)

July 2018

  • Talk: "A computational approach to lexical polysemy in Ancient Greek" at the workshop "Computational methods for literary-historical textual scholarship", Leicester, UK (Dr Barbara McGillivray and Dr Alessandro Vatri)

June 2018

  • Project ended

December 2017

  • Project started

Organisers