Abstract
Similarity measures for text have historically been an important tool for solving information retrieval problems. In this paper we consider extended similarity metrics for documents and other objects embedded in graphs, facilitated via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. The suggested framework is evaluated for the task of disambiguating names in email documents. We show that reranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods.
Original language | English |
---|---|
Pages | 1-8 |
Number of pages | 8 |
State | Published - 2020 |
Externally published | Yes |
Event | 1st Workshop on Graph-Based Algorithms for Natural Language Processing, Textgraphs 2006 at Human Language Technologies - New York City, United States. Duration: 9 Jun 2006 → … |
Conference
Conference | 1st Workshop on Graph-Based Algorithms for Natural Language Processing, Textgraphs 2006 at Human Language Technologies |
---|---|
Country/Territory | United States |
City | New York City |
Period | 9/06/06 → … |
Bibliographical note
Publisher Copyright: © 2006 Association for Computational Linguistics
ASJC Scopus subject areas
- Software