Abstract
We propose a workflow for retrieving place names from a corpus of Hebrew historical newspapers. We show that starting from a curated set of unambiguous toponyms and expanding it by vector similarity is a productive way to populate a gazetteer with previously unknown variant names of known places, as well as with names of places not yet included in the gazetteer. We examine several parameters for enhancing accuracy and suggest a workflow that combines computation with human expertise and is valuable to spatial history as well as to other domains.
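The seed-and-similarity idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `cosine` and `rank_candidates`, the `threshold` value, the scoring rule (best similarity to any seed), and the toy three-dimensional vectors, which stand in for embeddings that would in practice be trained on the newspaper corpus. Transliterated tokens are used as placeholders for Hebrew word forms.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(embeddings, seeds, threshold=0.8):
    """Return (word, score) pairs for non-seed words, best first.

    A word's score is its highest cosine similarity to any seed toponym.
    Only words scoring at or above `threshold` are kept. Both the scoring
    rule and the default threshold are illustrative assumptions.
    """
    seed_vecs = [embeddings[s] for s in seeds if s in embeddings]
    scored = []
    for word, vec in embeddings.items():
        if word in seeds:
            continue
        score = max((cosine(vec, sv) for sv in seed_vecs), default=0.0)
        if score >= threshold:
            scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors for illustration only; not real corpus embeddings.
toy_embeddings = {
    "jerusalem": [0.9, 0.1, 0.0],
    "jaffa": [0.8, 0.2, 0.1],
    "yafo": [0.82, 0.18, 0.12],
    "bread": [0.0, 0.1, 0.9],
}
seeds = {"jerusalem", "jaffa"}

# Candidates that sit close to the seed toponyms in vector space.
candidates = rank_candidates(toy_embeddings, seeds)
```

In a workflow of the kind the abstract describes, a ranked list like `candidates` would be passed to a human expert for review rather than written into the gazetteer automatically.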
Original language | English |
---|---|
Title of host publication | Proceedings of the 5th ACM SIGSPATIAL International Workshop on Geospatial Humanities, GeoHumanities 2021 |
Editors | Ludovic Moncla, Carmen Brando, Katherine McDonough |
Publisher | Association for Computing Machinery, Inc |
ISBN (Electronic) | 9781450391023 |
State | Published - 2 Nov 2021 |
Event | 5th ACM SIGSPATIAL International Workshop on Geospatial Humanities, GeoHumanities 2021 - Beijing, China. Duration: 2 Nov 2021 → … |
Conference
Conference | 5th ACM SIGSPATIAL International Workshop on Geospatial Humanities, GeoHumanities 2021 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 2/11/21 → … |
Bibliographical note
Publisher Copyright: © 2021 Owner/Author.
Keywords
- Gazetteers
- Historical Newspapers
- Natural Language Processing
- Toponym extraction
- Word Embeddings
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design
- Information Systems