k-means++: The advantages of careful seeding.

The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
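The seeding technique described above (often called D² sampling) can be sketched as follows: the first center is chosen uniformly at random, and each subsequent center is a data point chosen with probability proportional to its squared distance from the nearest center already selected. This is a minimal illustrative sketch for 2-D points, not the authors' reference implementation; the function name and point representation are assumptions.

```python
import random

def kmeans_pp_seed(points, k, rng=random.Random(0)):
    """Pick k initial centers from `points` (list of (x, y) tuples)
    via D^2 sampling: first center uniform, each next center drawn
    with probability proportional to squared distance to the
    nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest current center.
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
              for px, py in points]
        # Sample the next center with probability proportional to d2.
        r = rng.random() * sum(d2)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers
```

The chosen centers then serve as the initialization for the standard k-means (Lloyd's) iterations; the Θ(log k) guarantee already holds for the cost of the seeds themselves.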

References in zbMATH (referenced in 134 articles )

Showing results 1 to 20 of 134.
Sorted by year (citations)


  1. Deshpande, Amit; Pratap, Rameshwar: Sampling-based dimension reduction for subspace approximation with outliers (2021)
  2. Gangloff, Hugo; Courbot, Jean-Baptiste; Monfrini, Emmanuel; Collet, Christophe: Unsupervised image segmentation with Gaussian pairwise Markov fields (2021)
  3. Jiang, Xiaoping; Bai, Ruibin; Wallace, Stein W.; Kendall, Graham; Landa-Silva, Dario: Soft clustering-based scenario bundling for a progressive hedging heuristic in stochastic service network design (2021)
  4. Juan-Albarracín, Javier; Fuster-Garcia, Elies; Juan, Alfons; García-Gómez, Juan M.: Non-local spatially varying finite mixture models for image segmentation (2021)
  5. Wang, Minjie; Allen, Genevera I.: Integrative generalized convex clustering optimization and feature selection for mixed multi-view data (2021)
  6. Zhang, Dongmei; Cheng, Yukun; Li, Min; Wang, Yishui; Xu, Dachuan: Approximation algorithms for spherical k-means problem using local search scheme (2021)
  7. Zhang, Tonglin; Lin, Ge: Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States (2021)
  8. Ahmadian, Sara; Norouzi-Fard, Ashkan; Svensson, Ola; Ward, Justin: Better guarantees for k-means and Euclidean k-median by primal-dual algorithms (2020)
  9. Bunea, Florentina; Giraud, Christophe; Luo, Xi; Royer, Martin; Verzelen, Nicolas: Model assisted variable clustering: minimax-optimal recovery and algorithms (2020)
  10. Capó, Marco; Pérez, Aritz; Lozano, Jose A.: An efficient K-means clustering algorithm for tall data (2020)
  11. Ding, Hu; Xu, Jinhui: A unified framework for clustering constrained data without locality property (2020)
  12. Duan, Leo L.: Latent simplex position model: high dimensional multi-view clustering with uncertainty quantification (2020)
  13. Feldman, Dan; Schmidt, Melanie; Sohler, Christian: Turning big data into tiny data: constant-size coresets for k-means, PCA, and projective clustering (2020)
  14. Feng, Ben Mingbin; Tan, Zhenni; Zheng, Jiayi: Efficient simulation designs for valuation of large variable annuity portfolios (2020)
  15. Guillaume, Serge; Ros, Frédéric: A family of unsupervised sampling algorithms (2020)
  16. Hämäläinen, Joonas; Alencar, Alisson S. C.; Kärkkäinen, Tommi; Mattos, César L. C.; Souza Júnior, Amauri H.; Gomes, João P. P.: Minimal learning machine: theoretical results and clustering-based reference point selection (2020)
  17. Hosseini, Reshad; Sra, Suvrit: An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization (2020)
  18. Irmatov, Anvar Adkhamovich; Irmatova, Èl’nura Anvarovna: Estimation of the inclusive development index based on the REL-PCANet neural network model (2020)
  19. Irons, Linda; Huang, Huang; Owen, Markus R.; O’Dea, Reuben D.; Meininger, Gerald A.; Brook, Bindi S.: Switching behaviour in vascular smooth muscle cell-matrix adhesion during oscillatory loading (2020)
  20. Kazakovtsev, Lev; Rozhnov, Ivan; Shkaberina, Guzel; Orlov, Viktor: k-means genetic algorithms with greedy genetic operators (2020)
