Despite surging interest in multiword expressions (MWEs) in recent years, it is still unclear how such expressions should be stored in computational lexicons. The problem is amplified in morphologically complex languages, where the unique properties of MWEs interact with non-trivial morphological processes. We propose an architecture for the lexical representation of MWEs, augmented by a protocol for integrating MWEs into a morphological processing system. The proposal is applied to Modern Hebrew, a Semitic language with complex morphology and a problematic orthography. The result is an integrated system that can morphologically process Hebrew multiword expressions of various types. In light of the complexity of Hebrew morphology and orthography, we are confident that the proposed architecture is general enough to accommodate MWEs in a large number of languages.
Bibliographical note
Funding Information:
This research was supported by the Israel Science Foundation (grant Nos. 137/06 and 1269/07). We are grateful to the anonymous IJL reviewers for very constructive comments that greatly improved this paper. All remaining errors are, of course, our own.
ASJC Scopus subject areas
- Language and Linguistics