k-means++
k-means++: The advantages of careful seeding. The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
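The abstract does not spell out the seeding rule itself. As a minimal sketch, assuming the D²-weighted sampling usually associated with k-means++ (first center uniform at random, each later center drawn with probability proportional to its squared distance from the nearest center chosen so far), the seeding step might look like the following. The function names `squared_distance` and `kmeans_pp_seed` are hypothetical and not taken from any particular library.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2-weighted sampling (sketch)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = rng or random.Random()

    # First center: chosen uniformly at random from the data.
    centers = [points[rng.randrange(len(points))]]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        total = sum(d2)
        if total == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Draw index i with probability d2[i] / total.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Refresh nearest-center distances against the newly added center.
        d2 = [min(old, squared_distance(p, points[idx]))
              for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans_pp_seed(data, 2, random.Random(0)))
```

The returned centers would then be handed to an ordinary k-means (Lloyd) iteration in place of uniformly random starting points. Because already-chosen points have zero weight, the sketch tends to spread the initial centers across well-separated groups.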
References in zbMATH (referenced in 113 articles)
- Ahmadian, Sara; Norouzi-Fard, Ashkan; Svensson, Ola; Ward, Justin: Better guarantees for k-means and Euclidean k-median by primal-dual algorithms (2020)
- Bunea, Florentina; Giraud, Christophe; Luo, Xi; Royer, Martin; Verzelen, Nicolas: Model assisted variable clustering: minimax-optimal recovery and algorithms (2020)
- Capó, Marco; Pérez, Aritz; Lozano, Jose A.: An efficient K-means clustering algorithm for tall data (2020)
- Ding, Hu; Xu, Jinhui: A unified framework for clustering constrained data without locality property (2020)
- Duan, Leo L.: Latent simplex position model: high dimensional multi-view clustering with uncertainty quantification (2020)
- Feldman, Dan; Schmidt, Melanie; Sohler, Christian: Turning big data into tiny data: constant-size coresets for k-means, PCA, and projective clustering (2020)
- Guillaume, Serge; Ros, Frédéric: A family of unsupervised sampling algorithms (2020)
- Hosseini, Reshad; Sra, Suvrit: An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization (2020)
- Ling, Shuyang; Strohmer, Thomas: Certifying global optimality of graph cuts via semidefinite relaxation: a performance guarantee for spectral clustering (2020)
- Lü, Hongliang; Arbel, Julyan; Forbes, Florence: Bayesian nonparametric priors for hidden Markov random fields (2020)
- Otsuka, Hajime; Takemoto, Kenta: Deep learning and k-means clustering in heterotic string vacua with line bundles (2020)
- Rezaei, Mohammad: Improving a centroid-based clustering by using suitable centroids from another clustering (2020)
- Sayed, Gehad Ismail; Darwish, Ashraf; Hassanien, Aboul Ella: Binary whale optimization algorithm and binary moth flame optimization with clustering algorithms for clinical breast cancer diagnoses (2020)
- Simpson, Edwin; Gurevych, Iryna: Scalable Bayesian preference learning for crowds (2020)
- Tremblay, Nicolas; Loukas, Andreas: Approximating spectral clustering via sampling: a review (2020)
- Wang, Dingkang; Wang, Yusu: An improved cost function for hierarchical cluster trees (2020)
- Yu, Jaehong; Zhong, Hua; Kim, Seoung Bum: An ensemble feature ranking algorithm for clustering analysis (2020)
- Chen, Liyuan; Li, Yutian; Zeng, Tieyong: Variational image restoration and segmentation with Rician noise (2019)
- Dasgupta, Agnimitra; Ghosh, Debraj: Failure probability estimation of linear time varying systems by progressive refinement of reduced order models (2019)
- Eisenach, Carson; Liu, Han: Efficient, certifiably optimal clustering with applications to latent variable graphical models (2019)