TY - GEN
T1 - A PTAS for k-means clustering based on weak coresets
AU - Feldman, Dan
AU - Monemizadeh, Morteza
AU - Sohler, Christian
PY - 2007
Y1 - 2007
N2 - Given a point set $P \subseteq \mathbb{R}^d$, the k-means clustering problem is to find a set $C = (c_1, \ldots, c_k)$ of k points and a partition of P into k clusters $C_1, \ldots, C_k$ such that the sum of squared errors $\sum_{i=1}^{k} \sum_{p \in C_i} \|p - c_i\|_2^2$ is minimized. For given centers this cost function is minimized by assigning points to the nearest center. The k-means cost function is probably the most widely used cost function in the area of clustering. In this paper we show that every unweighted point set P has a weak $(\varepsilon, k)$-coreset of size $\mathrm{Poly}(k, 1/\varepsilon)$ for the k-means clustering problem, i.e. its size is independent of the cardinality $|P|$ of the point set and the dimension d of the Euclidean space $\mathbb{R}^d$. A weak coreset is a weighted set $S \subseteq P$ together with a set T such that T contains a $(1+\varepsilon)$-approximation for the optimal cluster centers from P and for every set of k centers from T the cost of the centers for S is a $(1 \pm \varepsilon)$-approximation of the cost for P. We apply our weak coreset to obtain a PTAS for the k-means clustering problem with running time $O(nkd + d \cdot \mathrm{Poly}(k/\varepsilon) + 2^{\tilde{O}(k/\varepsilon)})$.
AB - Given a point set $P \subseteq \mathbb{R}^d$, the k-means clustering problem is to find a set $C = (c_1, \ldots, c_k)$ of k points and a partition of P into k clusters $C_1, \ldots, C_k$ such that the sum of squared errors $\sum_{i=1}^{k} \sum_{p \in C_i} \|p - c_i\|_2^2$ is minimized. For given centers this cost function is minimized by assigning points to the nearest center. The k-means cost function is probably the most widely used cost function in the area of clustering. In this paper we show that every unweighted point set P has a weak $(\varepsilon, k)$-coreset of size $\mathrm{Poly}(k, 1/\varepsilon)$ for the k-means clustering problem, i.e. its size is independent of the cardinality $|P|$ of the point set and the dimension d of the Euclidean space $\mathbb{R}^d$. A weak coreset is a weighted set $S \subseteq P$ together with a set T such that T contains a $(1+\varepsilon)$-approximation for the optimal cluster centers from P and for every set of k centers from T the cost of the centers for S is a $(1 \pm \varepsilon)$-approximation of the cost for P. We apply our weak coreset to obtain a PTAS for the k-means clustering problem with running time $O(nkd + d \cdot \mathrm{Poly}(k/\varepsilon) + 2^{\tilde{O}(k/\varepsilon)})$.
KW - Approximation
KW - Coresets
KW - Geometric optimization
KW - K-mean
UR - http://www.scopus.com/inward/record.url?scp=35348830377&partnerID=8YFLogxK
U2 - 10.1145/1247069.1247072
DO - 10.1145/1247069.1247072
M3 - Conference contribution
AN - SCOPUS:35348830377
SN - 1595937056
SN - 9781595937056
T3 - Proceedings of the Annual Symposium on Computational Geometry
SP - 11
EP - 18
BT - Proceedings of the Twenty-third Annual Symposium on Computational Geometry, SCG'07
T2 - 23rd Annual Symposium on Computational Geometry, SCG'07
Y2 - 6 June 2007 through 8 June 2007
ER -