k-means++: The advantages of careful seeding.

The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
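
The abstract does not spell out the seeding rule itself. As a rough illustration, the sketch below assumes the D² weighting usually associated with k-means++: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The names `kmeanspp_seed` and `squared_distance` are hypothetical, and the uniform fallback for the degenerate case (all points coincide with a center) is an added assumption, not part of the abstract.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` using D^2-weighted sampling.

    Hypothetical helper for illustration only; it performs the seeding
    step and leaves the subsequent k-means iterations to the caller.
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: uniform at random.
    centers = [rng.choice(points)]

    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption: if every point coincides with a chosen center,
            # fall back to a uniform draw.
            nxt = rng.choice(points)
        else:
            # Sample the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the new center can shrink a point's nearest-center distance.
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]

    return centers
```

The returned centers would then be handed to an ordinary k-means loop as its starting configuration. Because an already chosen point has weight zero, it is not drawn again unless the degenerate fallback is triggered.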

References in zbMATH (referenced in 119 articles)

Showing results 1 to 20 of 119.
Sorted by year (citations)

  1. Ahmadian, Sara; Norouzi-Fard, Ashkan; Svensson, Ola; Ward, Justin: Better guarantees for k-means and Euclidean k-median by primal-dual algorithms (2020)
  2. Bunea, Florentina; Giraud, Christophe; Luo, Xi; Royer, Martin; Verzelen, Nicolas: Model assisted variable clustering: minimax-optimal recovery and algorithms (2020)
  3. Capó, Marco; Pérez, Aritz; Lozano, Jose A.: An efficient K-means clustering algorithm for tall data (2020)
  4. Ding, Hu; Xu, Jinhui: A unified framework for clustering constrained data without locality property (2020)
  5. Duan, Leo L.: Latent simplex position model: high dimensional multi-view clustering with uncertainty quantification (2020)
  6. Feldman, Dan; Schmidt, Melanie; Sohler, Christian: Turning big data into tiny data: constant-size coresets for k-means, PCA, and projective clustering (2020)
  7. Feng, Ben Mingbin; Tan, Zhenni; Zheng, Jiayi: Efficient simulation designs for valuation of large variable annuity portfolios (2020)
  8. Guillaume, Serge; Ros, Frédéric: A family of unsupervised sampling algorithms (2020)
  9. Hosseini, Reshad; Sra, Suvrit: An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization (2020)
  10. Irons, Linda; Huang, Huang; Owen, Markus R.; O’Dea, Reuben D.; Meininger, Gerald A.; Brook, Bindi S.: Switching behaviour in vascular smooth muscle cell-matrix adhesion during oscillatory loading (2020)
  11. Ling, Shuyang; Strohmer, Thomas: Certifying global optimality of graph cuts via semidefinite relaxation: a performance guarantee for spectral clustering (2020)
  12. Lü, Hongliang; Arbel, Julyan; Forbes, Florence: Bayesian nonparametric priors for hidden Markov random fields (2020)
  13. Otsuka, Hajime; Takemoto, Kenta: Deep learning and k-means clustering in heterotic string vacua with line bundles (2020)
  14. Rezaei, Mohammad: Improving a centroid-based clustering by using suitable centroids from another clustering (2020)
  15. Sayed, Gehad Ismail; Darwish, Ashraf; Hassanien, Aboul Ella: Binary whale optimization algorithm and binary moth flame optimization with clustering algorithms for clinical breast cancer diagnoses (2020)
  16. Simpson, Edwin; Gurevych, Iryna: Scalable Bayesian preference learning for crowds (2020)
  17. Škrlj, Blaž; Kralj, Jan; Lavrač, Nada: Embedding-based silhouette community detection (2020)
  18. Tremblay, Nicolas; Loukas, Andreas: Approximating spectral clustering via sampling: a review (2020)
  19. Wang, Dingkang; Wang, Yusu: An improved cost function for hierarchical cluster trees (2020)
  20. Yu, Jaehong; Zhong, Hua; Kim, Seoung Bum: An ensemble feature ranking algorithm for clustering analysis (2020)
