The Diorisis Ancient Greek Corpus


This corpus was created in the context of the project "Computational models of meaning change in Ancient Greek", led by Barbara McGillivray.


Related data set: “Diorisis Ancient Greek Corpus”, available with a DOI in the Figshare repository. The Diorisis Ancient Greek Corpus is a digital collection of ancient Greek texts (from Homer to the early fifth century AD) compiled for linguistic analysis, specifically to support the development of a computational model of semantic change in Ancient Greek. The corpus consists of 820 texts sourced from open-access digital libraries. Each word in the texts has been automatically enriched with morphological information. Where a word form could belong to more than one dictionary entry, the automatic assignment of words to the correct entry (lemmatization) has been disambiguated with a part-of-speech tagger (a computer programme that selects the part of speech to which an ambiguous word belongs).
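
The disambiguation step described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the corpus: the toy lexicon, the part-of-speech labels, and the function name lemmatize are all assumptions made for illustration, and the tagger's prediction is simply passed in as an argument.

```python
# Toy lexicon mapping a word form to its candidate (lemma, part of speech)
# analyses. Illustrative entries only; they are not taken from the corpus.
CANDIDATES = {
    "ἄκρα": [("ἄκρα", "noun"), ("ἄκρος", "adjective")],
    "λόγος": [("λόγος", "noun")],
}


def lemmatize(form, predicted_pos):
    """Pick the candidate lemma whose part of speech matches the tagger.

    Returns None for an unknown form. If no candidate matches the
    predicted part of speech, falls back to the first candidate.
    """
    analyses = CANDIDATES.get(form, [])
    if not analyses:
        return None
    for lemma, pos in analyses:
        if pos == predicted_pos:
            return lemma
    return analyses[0][0]


# The same form resolves to different lemmas depending on the predicted tag.
as_noun = lemmatize("ἄκρα", "noun")
as_adjective = lemmatize("ἄκρα", "adjective")
```

In this sketch the tagger only narrows an existing list of candidates; a form with a single analysis is returned unchanged whatever tag is predicted.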

Citation information

Vatri, A., & McGillivray, B. (2018). The Diorisis Ancient Greek Corpus. Research Data Journal for the Humanities and Social Sciences, 3(1), 55-65. doi:
