Providing concise database covers instantly by recursive tile sampling

Sandy Moens, Mario Boley, Bart Goethals

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Known pattern discovery algorithms for finding tilings (covers of 0/1-databases consisting of 1-rectangles) cannot be integrated in instant and interactive KD tools, because they do not satisfy at least one of two key requirements: a) to provide results within a short response time of only a few seconds and b) to return a concise set of patterns with only a few elements that nevertheless covers a large fraction of the input database. In this paper we present a novel randomized algorithm that works well under these requirements. It is based on the recursive application of a simple tile sample procedure that can be implemented efficiently using rejection sampling. While, as we analyse, the theoretical solution distribution can be weak in the worst case, the approach performs very well in practice and outperforms previous sampling as well as deterministic algorithms.
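To make the idea in the abstract concrete, the following is a minimal sketch of recursive tile sampling with a rejection step. It is not the authors' algorithm: the proposal scheme, the acceptance bound, the recursion on the residual database, and all names (`closed_tile`, `sample_tile`, `cover`, `sample_cover`) are assumptions chosen for illustration. A database is modelled as a list of item sets, and a tile as a pair (rows, columns) whose cells are all 1.

```python
import random


def closed_tile(db, seed_items):
    """Return (rows, cols): every row containing seed_items, and the items those rows share."""
    rows = [i for i, row in enumerate(db) if seed_items <= row]
    cols = set.intersection(*(db[i] for i in rows))
    return frozenset(rows), frozenset(cols)


def sample_tile(db, rng, max_tries=1000):
    """Rejection step (assumed scheme): propose a tile, accept with probability area / bound."""
    nonempty = [i for i, row in enumerate(db) if row]
    if not nonempty:
        return None
    bound = sum(len(row) for row in db)  # a tile cannot cover more ones than exist
    for _ in range(max_tries):
        seed_row = db[rng.choice(nonempty)]
        size = rng.randint(1, len(seed_row))
        seed = set(rng.sample(sorted(seed_row), size))
        rows, cols = closed_tile(db, seed)
        if rng.random() < len(rows) * len(cols) / bound:
            return rows, cols
    return None  # give up after max_tries rejected proposals


def cover(db, k, rng):
    """Recursively sample up to k tiles, removing covered cells before each recursive call."""
    if k == 0:
        return []
    tile = sample_tile(db, rng)
    if tile is None:
        return []
    rows, cols = tile
    residual = [row - cols if i in rows else row for i, row in enumerate(db)]
    return [(sorted(rows), sorted(cols))] + cover(residual, k - 1, rng)


def sample_cover(db, k, seed=0):
    """Hypothetical entry point: db is a list of item sets, k the maximum number of tiles."""
    return cover([set(row) for row in db], k, random.Random(seed))
```

For example, `sample_cover([{1, 2, 3}, {1, 2}, {2, 3}, {4}], k=3)` returns at most three non-overlapping tiles. Accepting a proposal with probability proportional to its area biases the sample toward large tiles, which is one plausible way to keep the cover concise; the paper's actual sampling distribution may differ.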

Original language: English
Title of host publication: Discovery Science - 17th International Conference, DS 2014, Proceedings
Editors: Sašo Džeroski, Panče Panov, Dragi Kocev, Ljupčo Todorovski
Publisher: Springer Verlag
Pages: 216-227
Number of pages: 12
ISBN (Electronic): 9783319118116
State: Published - 2014
Externally published: Yes
Event: 17th International Conference on Discovery Science, DS 2014 - Bled, Slovenia
Duration: 8 Oct 2014 to 10 Oct 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8777
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Discovery Science, DS 2014
Country/Territory: Slovenia
City: Bled
Period: 8/10/14 to 10/10/14

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2014.

Keywords

  • Instant Pattern Mining
  • Sampling Closed Itemsets
  • Tiling Databases

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
