MapReader: a computer vision pipeline for the semantic exploration of maps at scale

Abstract

We present MapReader, a free, open-source software library written in Python for analyzing large map collections. MapReader allows users with little computer vision expertise to i) retrieve maps via web servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. We demonstrate how MapReader enables historians to interpret a collection of ≈16K nineteenth-century maps of Britain (≈30.5M patches), foregrounding the challenge of translating visual markers into machine-readable data. We present a case study focusing on rail and buildings. We also show how the outputs from the MapReader pipeline can be linked to other, external datasets. We release ≈62K manually annotated patches used here for training and evaluating the models.
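
To make the patching step (ii) more concrete, the following is a minimal sketch of dividing a raster sheet into non-overlapping square patches. It is an illustration only and not MapReader's own API: the names `Grid` and `iter_patches`, the use of nested lists as a stand-in for image data, and the choice to keep smaller patches at the right and bottom edges are all assumptions made for this example.

```python
from typing import Iterator, List, Tuple

# Hypothetical stand-in for a raster map sheet: rows of pixel values.
Grid = List[List[int]]


def iter_patches(image: Grid, patch_size: int) -> Iterator[Tuple[int, int, Grid]]:
    """Yield (row_offset, col_offset, patch) for non-overlapping square patches.

    Patches on the right and bottom edges may be smaller than patch_size
    (an assumption of this sketch; a real pipeline might pad or drop them).
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Slice the rows first, then the columns within each row.
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            yield top, left, patch


# A toy 4x6 "sheet" whose pixel values encode their own (row, column) position.
sheet = [[r * 10 + c for c in range(6)] for r in range(4)]
patches = list(iter_patches(sheet, patch_size=2))
print(len(patches))  # 2 rows of patches x 3 columns of patches = 6
```

Keeping the offsets alongside each patch is what would let a predicted label (for example, "rail" or "building") be traced back to a location on the original sheet, which is the kind of structured output the abstract describes.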

Citation information

Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, and Katherine McDonough. 2022. MapReader: a computer vision pipeline for the semantic exploration of maps at scale. In Proceedings of the 6th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities '22). Association for Computing Machinery, New York, NY, USA, 8–19.

Turing affiliated authors