In this paper we present a methodology based on distributional semantic models that can be flexibly adapted to the specific challenges posed by historical texts and that allow users to retrieve semantically relevant text without the need to close-read the documents. We focus on a case study concerned with detecting smell-related sentences in historical medical reports. We demonstrate a process for moving from generic domain label input to a more nuanced evaluation of the semantics of smell in a set of sentences extracted from this corpus, and then develop a machine learning technique for compounding scores on a variety of modelling parameters into more effective classifications.
McGregor, S. and McGillivray, B. (2018). A Distributional Semantic Methodology for Enhanced Search in Historical Records: A Case Study on Smell. Proceedings of the 14th Conference on Natural Language Processing (KONVENS 2018), Vienna, Austria, September 19-21, 2018. Austrian Academy of Sciences Press.