Event extraction using structured learning and rich domain knowledge: Application across domains and data sources

Research output: Contribution to journal, article, peer-reviewed


We consider the task of record extraction from text documents, where the goal is to automatically populate the fields of target relations, such as scientific seminars or corporate acquisition events. There are various inferences involved in the record-extraction process, including mention detection, unification, and field assignments. We use structured learning to find the appropriate field-value assignments. Unlike previous works, the proposed approach generates feature-rich models that enable the modeling of domain semantics and structural coherence at all levels and across fields. Given labeled examples, such an approach can, for instance, learn likely event durations and the fact that start times should come before end times. While the inference space is large, effective learning is achieved using a perceptron-style method and simple, greedy beam decoding. A main focus of this article is on practical aspects involved in implementing the proposed framework for real-world applications. We argue and demonstrate that this approach is favorable in conditions of data shift, a real-world setting in which models learned using a limited set of labeled examples are applied to examples drawn from a different data distribution. Much of the framework's robustness is attributed to the modeling of domain knowledge. We describe design and implementation details for the case study of seminar event extraction from email announcements, and discuss design adaptations across different domains and text genres.
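The learning setup described in the abstract can be illustrated with a minimal sketch: a perceptron-style learner that scores whole field assignments and decodes them with a greedy beam. This is not the authors' implementation. The two-field schema (`stime`, `etime`), the mention attributes (`cue`, `minutes`), the feature names, the toy documents, and the plain full-assignment update rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical record schema: a seminar with a start time and an end time.
FIELDS = ["stime", "etime"]


def features(doc, assignment):
    """Sparse feature counts for a (possibly partial) field assignment."""
    feats = defaultdict(float)
    for field, idx in assignment.items():
        mention = doc["mentions"][idx]
        # Local feature: the word preceding the mention, paired with the field.
        feats[f"{field}:cue={mention['cue']}"] += 1.0
    if "stime" in assignment and "etime" in assignment:
        # Cross-field feature encoding domain knowledge: start precedes end.
        start = doc["mentions"][assignment["stime"]]["minutes"]
        end = doc["mentions"][assignment["etime"]]["minutes"]
        name = "start_before_end" if start < end else "start_not_before_end"
        feats[name] += 1.0
    return feats


def score(weights, feats):
    return sum(weights[name] * value for name, value in feats.items())


def beam_decode(weights, doc, beam_size=2):
    """Fill fields one at a time, keeping the best partial assignments."""
    beam = [{}]
    for field in FIELDS:
        expanded = [dict(partial, **{field: idx})
                    for partial in beam
                    for idx in range(len(doc["mentions"]))]
        expanded.sort(key=lambda a: score(weights, features(doc, a)),
                      reverse=True)
        beam = expanded[:beam_size]
    return beam[0]


def train(docs, epochs=5, beam_size=2):
    """Perceptron-style training: on a mistake, move toward the gold features."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for doc in docs:
            predicted = beam_decode(weights, doc, beam_size)
            if predicted != doc["gold"]:
                for name, value in features(doc, doc["gold"]).items():
                    weights[name] += value
                for name, value in features(doc, predicted).items():
                    weights[name] -= value
    return weights


# Invented toy data: each mention is a time expression, given as minutes
# after midnight, together with the word that precedes it.
TOY_DOCS = [
    {"mentions": [{"cue": "at", "minutes": 840},
                  {"cue": "until", "minutes": 900}],
     "gold": {"stime": 0, "etime": 1}},
    {"mentions": [{"cue": "until", "minutes": 720},
                  {"cue": "from", "minutes": 660}],
     "gold": {"stime": 1, "etime": 0}},
]
```

Calling `train(TOY_DOCS)` and then `beam_decode(weights, doc)` on a document whose cue words were never seen in training still assigns the earlier time to `stime`, because only the learned `start_before_end` feature distinguishes the candidates. This is a small-scale analogue of the robustness under data shift that the abstract attributes to modeling domain knowledge, not a reproduction of the reported results.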

Original language: English
Article number: 16
Journal: ACM Transactions on Intelligent Systems and Technology
Issue number: 2
State: Published - 1 Dec 2015

Bibliographical note

Publisher Copyright:
© 2015 ACM.


Keywords

  • Beam search
  • Domain knowledge
  • Information extraction
  • Perceptron
  • Structured learning
  • Template filling

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Artificial Intelligence

