k-means++: The advantages of careful seeding.

The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
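The seeding technique in question is D² sampling: the first center is chosen uniformly at random, and each subsequent center is drawn with probability proportional to its squared distance from the nearest center already chosen. A minimal sketch of that step (not the authors' reference implementation; the function names here are my own):

```python
import random

def dist_sq(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers by D^2 sampling (k-means++ seeding).

    The first center is uniform over the points; each later center
    is sampled with probability proportional to its squared distance
    to the nearest center chosen so far.
    """
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest current center.
        d2 = [min(dist_sq(p, c) for c in centers) for p in points]
        total = sum(d2)
        # Sample a point with probability d2[i] / total.
        r = rng.uniform(0.0, total)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers
```

The returned centers then serve as the initialization for ordinary Lloyd-style k-means iterations; the competitive guarantee above holds already for the cost of this seeding alone.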

References in zbMATH (referenced in 139 articles)

Showing results 1 to 20 of 139, sorted by year (citations).


  1. Brécheteau, Claire; Fischer, Aurélie; Levrard, Clément: Robust Bregman clustering (2021)
  2. Deshpande, Amit; Pratap, Rameshwar: Sampling-based dimension reduction for subspace approximation with outliers (2021)
  3. Gangloff, Hugo; Courbot, Jean-Baptiste; Monfrini, Emmanuel; Collet, Christophe: Unsupervised image segmentation with Gaussian pairwise Markov fields (2021)
  4. Jiang, Xiaoping; Bai, Ruibin; Wallace, Stein W.; Kendall, Graham; Landa-Silva, Dario: Soft clustering-based scenario bundling for a progressive hedging heuristic in stochastic service network design (2021)
  5. Juan-Albarracín, Javier; Fuster-Garcia, Elies; Juan, Alfons; García-Gómez, Juan M.: Non-local spatially varying finite mixture models for image segmentation (2021)
  6. Klimenko, Georgiy; Raichel, Benjamin; Van Buskirk, Gregory: Sparse convex hull coverage (2021)
  7. Liu, Qian; Liu, Jianxin; Li, Min; Zhou, Yang: Approximation algorithms for fuzzy C-means problem based on seeding method (2021)
  8. Wang, Minjie; Allen, Genevera I.: Integrative generalized convex clustering optimization and feature selection for mixed multi-view data (2021)
  9. Zhang, Dongmei; Cheng, Yukun; Li, Min; Wang, Yishui; Xu, Dachuan: Approximation algorithms for spherical k-means problem using local search scheme (2021)
  10. Zhang, Tonglin; Lin, Ge: Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States (2021)
  11. Ahmadian, Sara; Norouzi-Fard, Ashkan; Svensson, Ola; Ward, Justin: Better guarantees for k-means and Euclidean k-median by primal-dual algorithms (2020)
  12. Bunea, Florentina; Giraud, Christophe; Luo, Xi; Royer, Martin; Verzelen, Nicolas: Model assisted variable clustering: minimax-optimal recovery and algorithms (2020)
  13. Capó, Marco; Pérez, Aritz; Lozano, Jose A.: An efficient K-means clustering algorithm for tall data (2020)
  14. Ding, Hu; Xu, Jinhui: A unified framework for clustering constrained data without locality property (2020)
  15. Duan, Leo L.: Latent simplex position model: high dimensional multi-view clustering with uncertainty quantification (2020)
  16. Feldman, Dan; Schmidt, Melanie; Sohler, Christian: Turning big data into tiny data: constant-size coresets for k-means, PCA, and projective clustering (2020)
  17. Feng, Ben Mingbin; Tan, Zhenni; Zheng, Jiayi: Efficient simulation designs for valuation of large variable annuity portfolios (2020)
  18. Guillaume, Serge; Ros, Frédéric: A family of unsupervised sampling algorithms (2020)
  19. Hämäläinen, Joonas; Alencar, Alisson S. C.; Kärkkäinen, Tommi; Mattos, César L. C.; Souza Júnior, Amauri H.; Gomes, João P. P.: Minimal learning machine: theoretical results and clustering-based reference point selection (2020)
  20. Hosseini, Reshad; Sra, Suvrit: An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization (2020)
