Abstract
Scholars in the Humanities often strive for depth and rigor in a single field or a small number of interconnected linguistic, cultural, and historical domains. This specialization and expertise, however, come at the cost of bodies of knowledge becoming isolated from each other. Digital tools that facilitate linking resources from disparate languages, a process known as multi-lingual entity resolution, can break down linguistic barriers between scholastic islands and enhance fertile cross-cultural collaboration. However, many contemporary multi-lingual tools and underlying technological resources are developed or trained on web data, that is, modern text contributed by Web users over the past few decades. Unsurprisingly, therefore, these tools’ efficacy has been limited when applied to text written in historical languages and dialects. To address these issues, in this paper, we present MEHDIE, the Middle East Heritage Data Integration Endeavor, a project dedicated to facilitating the semantic integration of knowledge sources related to the history of the Middle East using innovative data integration methods. The prototype multi-lingual entity resolution system we are currently developing utilizes spatial (geo-coordinates) and cross-lingual matching of textual information about a pair of place records to help identify their semantic relationship. This unique ability allows researchers to create bridges between collections of historical places from different languages, sources, and cultures. We present the motivation for MEHDIE, the tool’s current state, and some of the challenges we intend to address in the future.
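The abstract's idea of combining spatial and textual evidence for a pair of place records can be sketched roughly as follows. This is an illustration under stated assumptions, not the MEHDIE implementation: the names `PlaceRecord`, `haversine_km`, `name_similarity`, and `match_score`, the 50 km cutoff, and the equal weighting are all hypothetical. The sketch also assumes both toponyms are already romanized, so it sidesteps the cross-lingual matching that the project itself addresses.

```python
import math
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class PlaceRecord:
    """Hypothetical place record: a toponym plus geo-coordinates."""
    name: str   # assumed to be romanized already
    lat: float  # decimal degrees
    lon: float  # decimal degrees


def haversine_km(a: PlaceRecord, b: PlaceRecord) -> float:
    """Great-circle distance between two records in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def name_similarity(a: PlaceRecord, b: PlaceRecord) -> float:
    """Case-insensitive string similarity in [0, 1] (a stand-in for cross-lingual matching)."""
    return SequenceMatcher(None, a.name.casefold(), b.name.casefold()).ratio()


def match_score(a: PlaceRecord, b: PlaceRecord,
                max_km: float = 50.0, name_weight: float = 0.5) -> float:
    """Weighted blend of name similarity and spatial proximity, in [0, 1].

    Proximity falls linearly from 1 at 0 km to 0 at max_km and beyond.
    Both max_km and name_weight are illustrative assumptions.
    """
    proximity = max(0.0, 1.0 - haversine_km(a, b) / max_km)
    return name_weight * name_similarity(a, b) + (1.0 - name_weight) * proximity


if __name__ == "__main__":
    dimashq = PlaceRecord("Dimashq", 33.5138, 36.2765)
    damascus = PlaceRecord("Damascus", 33.5138, 36.2765)
    cairo = PlaceRecord("Cairo", 30.0444, 31.2357)
    print(match_score(dimashq, damascus))  # same coordinates, related names
    print(match_score(dimashq, cairo))     # distant place, unrelated name
```

In this sketch, two records at the same coordinates with related spellings score above 0.5, while a distant record with an unrelated name scores below it. A real system would need a transliteration-aware or multilingual comparison in place of `name_similarity`.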
| Original language | English |
|---|---|
| Pages (from-to) | i238-i246 |
| Journal | Digital Scholarship in the Humanities |
| Volume | 41 |
| Issue number | Special Issue: ‘Digital Humanities 2023: Collaboration as Opp... |
| State | Published - Apr 2026 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025. Published by Oxford University Press on behalf of EADH.
Keywords
- data integration
- entity resolution
- Middle East
- multilingual
- toponym
- transliteration
ASJC Scopus subject areas
- Information Systems
- Language and Linguistics
- Linguistics and Language
- Computer Science Applications