Abstract
The input to the line-sets $k$-median problem is an integer $k \geq 1$ and a set $\mathcal{L} = \{L_1, \ldots, L_n\}$ that contains $n$ sets of lines in $\mathbb{R}^d$. The goal is to compute a set $C$ of $k$ centers (points in $\mathbb{R}^d$) that minimizes the sum $\sum_{L \in \mathcal{L}} \min_{\ell \in L,\, c \in C} \mathrm{dist}(\ell, c)$ of Euclidean distances from each set to its closest center, where $\mathrm{dist}(\ell, c) := \min_{x \in \ell} \|x - c\|_2$. An $\varepsilon$-coreset for this problem is a weighted subset of the sets in $\mathcal{L}$ that approximates this sum up to a multiplicative factor of $1 \pm \varepsilon$, for every set $C$ of $k$ centers. We prove that every such input set $\mathcal{L}$ has a small $\varepsilon$-coreset, and provide the first coreset construction for this problem and its variants. The coreset consists of $O(\log^2 n)$ weighted line-sets from $\mathcal{L}$, and is constructed in $O(n \log n)$ time for every fixed $d, k \geq 1$ and $\varepsilon \in (0, 1)$. The main technique is based on a novel reduction to a "fair clustering" of colored points to colored centers. We then provide a coreset for this coloring problem, which may be of independent interest. Open source code and experiments are also provided.
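To make the objective in the abstract concrete, the following is a minimal Python sketch of the cost being minimized. It only evaluates the sum for a given set of centers and is not the paper's coreset construction. The representation of a line as a `(point, direction)` pair, the function names `dist_point_to_line` and `line_sets_cost`, and the optional `weights` argument (mirroring the weighted line-sets of a coreset) are assumptions made for illustration.

```python
import math


def dist_point_to_line(c, p, v):
    """Euclidean distance from point c to the line {p + t*v : t real}.

    Assumed representation: p is any point on the line and v is a
    nonzero direction vector, both given as tuples of floats.
    """
    w = [ci - pi for ci, pi in zip(c, p)]
    vv = sum(vi * vi for vi in v)
    # Parameter t of the orthogonal projection of c onto the line.
    t = sum(wi * vi for wi, vi in zip(w, v)) / vv
    residual = [wi - t * vi for wi, vi in zip(w, v)]
    return math.sqrt(sum(r * r for r in residual))


def line_sets_cost(line_sets, centers, weights=None):
    """Sum over line-sets of the smallest line-to-center distance.

    Each line-set contributes min over its lines and over all centers
    of dist(line, center), optionally scaled by a per-set weight.
    """
    if weights is None:
        weights = [1.0] * len(line_sets)
    return sum(
        w * min(dist_point_to_line(c, p, v) for (p, v) in lines for c in centers)
        for w, lines in zip(weights, line_sets)
    )


# Toy input in the plane: two line-sets and k = 2 centers.
line_sets = [
    [((0.0, 0.0), (1.0, 0.0)), ((0.0, 5.0), (1.0, 0.0))],  # lines y = 0 and y = 5
    [((3.0, 0.0), (0.0, 1.0))],                            # line x = 3
]
centers = [(0.0, 1.0), (10.0, 10.0)]

total = line_sets_cost(line_sets, centers)
print(total)  # 1.0 (first set) + 3.0 (second set) = 4.0
```

In this notation, an $\varepsilon$-coreset would be a small list of line-sets with weights for which `line_sets_cost(coreset, centers, weights)` stays within a factor of $1 \pm \varepsilon$ of `line_sets_cost(line_sets, centers)` for every choice of `centers`.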
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022 |
| Editors | S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh |
| Publisher | Neural information processing systems foundation |
| ISBN (Electronic) | 9781713871088 |
| State | Published - 2022 |
| Event | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States. Duration: 28 Nov 2022 → 9 Dec 2022 |
Publication series
| Name | Advances in Neural Information Processing Systems |
|---|---|
| Volume | 35 |
| ISSN (Print) | 1049-5258 |
Conference
| Conference | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 |
|---|---|
| Country/Territory | United States |
| City | New Orleans |
| Period | 28/11/22 → 9/12/22 |
Bibliographical note
Publisher Copyright: © 2022 Neural information processing systems foundation. All rights reserved.
ASJC Scopus subject areas
- Signal Processing
- Information Systems
- Computer Networks and Communications
Fingerprint
Dive into the research topics of 'Coreset for Line-Sets Clustering'. Together they form a unique fingerprint.