TY - GEN
T1 - A randomized approach for approximating the number of frequent sets
AU - Boley, Mario
AU - Grosskreutz, Henrik
PY - 2008
Y1 - 2008
N2 - We investigate the problem of counting the number of frequent (item)sets, a problem known to be intractable in terms of an exact polynomial time computation. In this paper, we show that it is in general also hard to approximate. Subsequently, a randomized counting algorithm is developed using the Markov chain Monte Carlo method. While for general inputs an exponential running time is needed in order to guarantee a certain approximation bound, we empirically show that the algorithm still has the desired accuracy on real-world datasets when its running time is capped polynomially.
AB - We investigate the problem of counting the number of frequent (item)sets, a problem known to be intractable in terms of an exact polynomial time computation. In this paper, we show that it is in general also hard to approximate. Subsequently, a randomized counting algorithm is developed using the Markov chain Monte Carlo method. While for general inputs an exponential running time is needed in order to guarantee a certain approximation bound, we empirically show that the algorithm still has the desired accuracy on real-world datasets when its running time is capped polynomially.
UR - http://www.scopus.com/inward/record.url?scp=67049119060&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2008.85
DO - 10.1109/ICDM.2008.85
M3 - Conference contribution
AN - SCOPUS:67049119060
SN - 9780769535029
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 43
EP - 52
BT - Proceedings - 8th IEEE International Conference on Data Mining, ICDM 2008
T2 - 8th IEEE International Conference on Data Mining, ICDM 2008
Y2 - 15 December 2008 through 19 December 2008
ER -