R package BayClone2: Estimating latent cell subpopulations with Bayesian feature allocation models.

Tumor cells are genetically heterogeneous. A tumor cell population consists of different subclones that can be characterized by mutations in sequence and structure at various genomic locations. Using next-generation sequencing data, we characterize tumor heterogeneity using Bayesian nonparametric inference. Specifically, we estimate the number of subclones in a tumor sample, and for each subclone, we estimate the subclonal copy number and single nucleotide mutations at a selected set of loci. Posterior summaries are presented in three matrices: the matrix of subclonal copy numbers (\(\boldsymbol{L}\)), the matrix of subclonal variant alleles (\(\boldsymbol{Z}\)), and the matrix of population frequencies of the subclones (\(\boldsymbol{w}\)). The proposed method can handle a single tumor sample or multiple tumor samples. Computation via Markov chain Monte Carlo yields posterior Monte Carlo samples of all three matrices, allowing for the assessment of any desired inference summary. Simulation and real-world examples are provided as illustration. An R package is available.
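To make the roles of the three matrices concrete, the following is a minimal Python sketch, not the BayClone2 package's interface. It assumes that rows of L and Z index loci and columns index subclones, that rows of w index samples and sum to 1, and that the expected variant allele fraction at a locus is the frequency-weighted variant allele count divided by the frequency-weighted copy number. Any background or normal-cell component and all noise are ignored. All values and function names are hypothetical.

```python
# Toy illustration only: values and function names are hypothetical,
# not taken from the paper or from the BayClone2 R package.

# L[s][c]: copy number of subclone c at locus s (3 loci, 2 subclones)
L = [[2, 3], [2, 1], [2, 2]]
# Z[s][c]: number of variant alleles of subclone c at locus s (Z <= L)
Z = [[0, 2], [1, 0], [0, 1]]
# w[t][c]: population frequency of subclone c in sample t (rows sum to 1)
w = [[0.7, 0.3], [0.4, 0.6]]


def weighted_sum(matrix, weights):
    """For each locus, sum matrix[s][c] * weights[c] over subclones c."""
    return [sum(m * f for m, f in zip(row, weights)) for row in matrix]


def expected_vaf(L, Z, w):
    """Return vaf[t][s]: assumed expected variant allele fraction.

    Assumption: vaf = (sum_c w[t][c] * Z[s][c]) / (sum_c w[t][c] * L[s][c]).
    """
    result = []
    for weights in w:
        copies = weighted_sum(L, weights)    # sample-average copy number
        variants = weighted_sum(Z, weights)  # sample-average variant count
        result.append([v / m for v, m in zip(variants, copies)])
    return result


vaf = expected_vaf(L, Z, w)
for t, row in enumerate(vaf):
    print(f"sample {t}:", [round(p, 3) for p in row])
```

Under these assumptions, samples with different subclone frequencies produce different variant allele fractions at the same loci, which illustrates why observing multiple samples can help separate the subclones.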

