CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data.
Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts through heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves on standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of hundreds of cells and within minutes for a data set of thousands of cells. CIDR can be downloaded at https://github.com/VCCRI/CIDR.
References in zbMATH (referenced in 5 articles)
- Lin, Zhixiang; Zamanighomi, Mahdi; Daley, Timothy; Ma, Shining; Wong, Wing Hung: Model-based approach to the joint analysis of single-cell data on chromatin accessibility and gene expression (2020)
- Liu, Yiyi; Warren, Joshua L.; Zhao, Hongyu: A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data (2019)
- Park, Seyoung; Zhao, Hongyu: Sparse principal component analysis with missing observations (2019)
- Suner, Aslı: Clustering methods for single-cell RNA-sequencing expression data: performance evaluation with varying sample sizes and cell compositions (2019)
- Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn: A unified statistical framework for single cell and bulk RNA sequencing data (2018)