Language resources for Hebrew

Alon Itai, Shuly Wintner

Research output: Contribution to journal; Article; peer-review


We describe a suite of standards, resources and tools for computational encoding and processing of Modern Hebrew texts. These include an array of XML schemas for representing linguistic resources; a variety of text corpora, raw, automatically processed and manually annotated; lexical databases, including a broad-coverage monolingual lexicon, a bilingual dictionary and a WordNet; and morphological processors which can analyze, generate and disambiguate Hebrew word forms. The resources are developed under centralized supervision, so that they are compatible with each other. They are freely available and many of them have already been used for several applications, both academic and industrial.

Original language: English
Pages (from-to): 75-98
Number of pages: 24
Journal: Language Resources and Evaluation
Issue number: 1
State: Published - Mar 2008

Bibliographical note

Funding Information:
Acknowledgments: This work was funded by the Israeli Ministry of Science and Technology. Parts of this project were supported by the Israel Science Foundation (grant No. 137/06); by the Israel Internet Association; and by the Caesarea Rothschild Institute for Interdisciplinary Application of Computer Science at the University of Haifa. Several people were involved in this work, and we are extremely grateful to all of them: Meni Adler, Roy Bar-Haim, Dalia Bojan, Ido Dagan, Michael Elhadad, Nomi Guthmann, Adi Milea, Noam Ordan, Erel Segal, Danny Shacham, Shira Schwartz, Yoad Winter, and Shlomo Yona. We are grateful to the reviewers for useful comments.

Keywords


  • Corpora
  • Hebrew
  • Language resources
  • Lexicon
  • Morphological processing
  • WordNet

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Linguistics and Language
  • Library and Information Sciences

