Abstract
(j, k)-projective clustering is the natural generalization of the family of k-clustering and j-subspace clustering problems. Given a set of points P in R^d, the goal is to find k flats of dimension j, i.e., affine subspaces, that best fit P under a given distance measure. In this paper, we propose the first algorithm that returns an L∞ coreset of size polynomial in d. Moreover, we give the first strong coreset construction for general M-estimator regression. Specifically, we show that our construction provides efficient coreset constructions for Cauchy, Welsch, Huber, Geman-McClure, Tukey, L1 − L2, and Fair regression, as well as general concave and power-bounded loss functions. Finally, we provide experimental results based on real-world datasets, showing the efficacy of our approach.
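To make the objective concrete, here is a minimal illustrative sketch (not the paper's algorithm) of the (j, k)-projective clustering cost under the L∞ measure: each point is charged its distance to the nearest of k affine j-dimensional flats, and the cost is the maximum such distance. All names and the toy data below are hypothetical.

```python
import numpy as np

def dist_to_flat(p, c, B):
    # Distance from point p to the affine subspace {c + B x}:
    # the norm of the component of (p - c) orthogonal to the
    # orthonormal basis B (a d x j matrix).
    r = p - c
    return np.linalg.norm(r - B @ (B.T @ r))

def linf_cost(P, flats):
    # L∞ cost: each point is served by its nearest flat, and the
    # objective is the worst-case (maximum) distance over all points.
    return max(min(dist_to_flat(p, c, B) for c, B in flats) for p in P)

# Toy example with d = 3, k = 2 flats of dimension j = 1 (lines).
P = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.1],
              [0.0, 1.0, 0.0]])
flats = [
    (np.zeros(3), np.array([[1.0], [0.0], [0.0]])),  # the x-axis
    (np.zeros(3), np.array([[0.0], [1.0], [0.0]])),  # the y-axis
]
print(linf_cost(P, flats))  # worst-case point-to-nearest-flat distance
```

An L∞ coreset, in this setting, is a small weighted subset of P on which this cost approximates the cost on all of P for every choice of k flats.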
| Original language | English |
|---|---|
| Pages (from-to) | 5391-5415 |
| Number of pages | 25 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 151 |
| State | Published - 2022 |
| Event | 25th International Conference on Artificial Intelligence and Statistics, AISTATS 2022, Virtual/Online, Spain; 28 Mar 2022 → 30 Mar 2022 |
Bibliographical note
Funding Information: This research was partially supported by the Israel National Cyber Directorate via the BIU Center for Applied Research in Cyber Security, and supported in part by NSF CAREER grant 1652257, NSF grant 1934979, ONR Award N00014-18-1-2364, and the Lifelong Learning Machines program from DARPA/MTO. In addition, Samson Zhou would like to thank National Institute of Health grant 5401 HG 10798-2 and a Simons Investigator Award of David P. Woodruff.
Publisher Copyright:
Copyright © 2022 by the author(s).
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability