Acronyms, words formed from the initial letters of a phrase, are important for various natural language processing applications, including information retrieval and machine translation. While hand-crafted acronym dictionaries exist, they are limited and require frequent updates. We present a new machine-learning-based approach to automatically build an acronym dictionary from unannotated texts. This is the first such technique that specifically handles non-local acronyms, i.e., that can determine an acronym’s expansion even when the expansion does not appear in the same document as the acronym. Our approach automatically enhances the dictionary with contextual information to help address the acronym disambiguation task (selecting the most appropriate expansion for a given acronym in context), outperforming dictionaries built using prior techniques. We apply the approach to Modern Hebrew, a language with a long tradition of using acronyms, in which the productive morphology and unique orthography add to the complexity of the problem.
Number of pages: 16
Journal: Annals of Mathematics and Artificial Intelligence
State: Published - 1 Jun 2020
Bibliographical note
Funding Information:
The authors are grateful to Ran El-Yaniv, Doug Freud, Assaf Glazer, Shie Mannor, and Shaul Markovitz for their machine learning advice. We thank Rafi Cohen for his help with LDA, Nachum Dershowitz for his historical acronym guidance, Chaim Kutnicki for his efficient coding support, Tomer Ashur and Sela Ferdman for their pre-processing of the Wikipedia corpus, and Josh Wortman for his dictionary assistance. Statistically significant improvements to our math were provided by Nicholas Mader, Breanna Miller, Tony Rieser, Zach Seeskin, and Brandon Willard. Thanks to acronym annotators Yosi Atia, Hannah Fadida, Limor Leibovich, Lior Leibovich, Shachar Maidenbaum, Elisheva Rotman, and Beny Shlevich. This research was supported by the Israel Science Foundation (grant No. 1269/07).
© 2018, Springer Nature Switzerland AG.
Keywords
- Modern Hebrew
- Natural language processing
ASJC Scopus subject areas
- Artificial Intelligence
- Applied Mathematics