Abstract
Known pattern discovery algorithms for finding tilings (covers of 0/1-databases consisting of 1-rectangles) cannot be integrated in instant and interactive KD tools, because they do not satisfy at least one of two key requirements: a) to provide results within a short response time of only a few seconds and b) to return a concise set of patterns with only a few elements that nevertheless covers a large fraction of the input database. In this paper we present a novel randomized algorithm that works well under these requirements. It is based on the recursive application of a simple tile sample procedure that can be implemented efficiently using rejection sampling. While, as we analyse, the theoretical solution distribution can be weak in the worst case, the approach performs very well in practice and outperforms previous sampling as well as deterministic algorithms.
Original language | English |
---|---|
Title of host publication | Discovery Science - 17th International Conference, DS 2014, Proceedings |
Editors | Sašo Džeroski, Panče Panov, Dragi Kocev, Ljupčo Todorovski |
Publisher | Springer Verlag |
Pages | 216-227 |
Number of pages | 12 |
ISBN (Electronic) | 9783319118116 |
DOIs | |
State | Published - 2014 |
Externally published | Yes |
Event | 17th International Conference on Discovery Science, DS 2014 - Bled, Slovenia Duration: 8 Oct 2014 → 10 Oct 2014 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 8777 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 17th International Conference on Discovery Science, DS 2014 |
---|---|
Country/Territory | Slovenia |
City | Bled |
Period | 8/10/14 → 10/10/14 |
Bibliographical note
Publisher Copyright:© Springer International Publishing Switzerland 2014.
Keywords
- Instant Pattern Mining
- Sampling Closed Itemsets
- Tiling Databases
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science