Abstract
While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB. However, a useful KB must go beyond facts. For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation. Yet the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored. We address that gap in this paper. In particular, we propose GLOFIN, a hierarchical semi-supervised learning algorithm for this problem that makes effective use of limited amounts of supervision and available ontological constraints. To the best of our knowledge, GLOFIN is the first system for this task. Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness. GLOFIN outperforms other state-of-the-art semi-supervised learning (SSL) algorithms, especially in low-supervision settings. We also demonstrate GLOFIN's robustness to noise through experiments on a wide variety of KBs, ranging from user-contributed (e.g., Freebase) to automatically constructed (e.g., NELL). To facilitate further research in this area, we have made the datasets and code used in this paper publicly available.
Original language | English |
---|---|
Title of host publication | WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining |
Publisher | Association for Computing Machinery |
Pages | 369-378 |
Number of pages | 10 |
ISBN (Electronic) | 9781450333177 |
State | Published - 2 Feb 2015 |
Event | 8th ACM International Conference on Web Search and Data Mining, WSDM 2015 - Shanghai, China; Duration: 31 Jan 2015 → 6 Feb 2015 |
Publication series
Name | WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining |
---|
Conference
Conference | 8th ACM International Conference on Web Search and Data Mining, WSDM 2015 |
---|---|
Country/Territory | China |
City | Shanghai |
Period | 31/01/15 → 6/02/15 |
Bibliographical note
Publisher Copyright: Copyright © 2015 ACM.
Keywords
- Gloss finding
- Hierarchical learning
- Web mining
ASJC Scopus subject areas
- Computer Networks and Communications