Abstract
Acronyms, words formed from the initial letters of a phrase, are important for various natural language processing applications, including information retrieval and machine translation. While hand-crafted acronym dictionaries exist, they are limited and require frequent updates. We present a new machine-learning-based approach to automatically build an acronym dictionary from unannotated texts. This is the first such technique that specifically handles non-local acronyms, i.e., that can determine an acronym’s expansion even when the expansion does not appear in the same document as the acronym. Our approach automatically enhances the dictionary with contextual information to help address the acronym disambiguation task (selecting the most appropriate expansion for a given acronym in context), outperforming dictionaries built using prior techniques. We apply the approach to Modern Hebrew, a language with a long tradition of using acronyms, in which the productive morphology and unique orthography add to the complexity of the problem.
Original language | English |
---|---|
Pages (from-to) | 517-532 |
Number of pages | 16 |
Journal | Annals of Mathematics and Artificial Intelligence |
Volume | 88 |
Issue number | 5-6 |
DOIs | |
State | Published - 1 Jun 2020 |
Bibliographical note
Publisher Copyright: © 2018, Springer Nature Switzerland AG.
Keywords
- Acronyms
- Modern Hebrew
- Natural language processing
ASJC Scopus subject areas
- Artificial Intelligence
- Applied Mathematics