Interesting pattern mining in multi-relational data

Eirini Spyropoulou, Tijl De Bie, Mario Boley

Research output: Contribution to journal › Article › peer-review

Abstract

Mining patterns from multi-relational data is a problem attracting increasing interest within the data mining community. Traditional data mining approaches are typically developed for single-table databases and are not directly applicable to multi-relational data. Nevertheless, multi-relational data is a more truthful, and therefore often also a more powerful, representation of reality. Mining patterns of a suitably expressive syntax directly from this representation is thus a research problem of great importance. In this paper we introduce a novel approach to mining patterns in multi-relational data. We propose a new syntax for multi-relational patterns as complete connected subsets of database entities. We show how this pattern syntax is generally applicable to multi-relational data, while it reduces to well-known tiles (Geerts et al., Proceedings of Discovery Science, pp 278-289, 2004) when the data is a simple binary or attribute-value table. We propose RMiner, a simple yet practically efficient divide-and-conquer algorithm to mine such patterns, which is an instantiation of an algorithmic framework for efficiently enumerating all fixed points of a suitable closure operator (Boley et al., Theor Comput Sci 411(3):691-700, 2010). We show how the interestingness of patterns of the proposed syntax can conveniently be quantified using a general framework for quantifying subjective interestingness of patterns (De Bie, Data Min Knowl Discov 23(3):407-446, 2011b). Finally, we illustrate the usefulness and the general applicability of our approach by discussing results on real-world and synthetic databases.

Original language: English
Pages (from-to): 808-849
Number of pages: 42
Journal: Data Mining and Knowledge Discovery
Volume: 28
Issue number: 3
State: Published - May 2014
Externally published: Yes

Keywords

  • Interestingness measures
  • K-partite graphs
  • Maximum entropy modelling
  • Multi-relational data mining
  • Pattern mining

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications
