BCBCSF

Bias-corrected hierarchical Bayesian classification with a selected subset of high-dimensional features

Class prediction based on high-dimensional features has received a great deal of attention in many areas of application. For example, biologists are interested in using microarray gene expression profiles for the diagnosis or prognosis of a disease (e.g., cancer). For computational and other reasons, it is necessary to select a subset of features before fitting a statistical model, by evaluating how strongly each feature is related to the response. However, such a feature selection procedure results in overconfident predictive probabilities for future cases, because the signal-to-noise ratio in the retained features is exaggerated by the selection. In this article we develop a hierarchical Bayesian classification method that corrects for this feature selection bias. Our method, which we term bias-corrected Bayesian classification with selected features (BCBCSF), uses the partial information from the feature selection procedure, in addition to the retained features, to form a correct (unbiased) posterior distribution of the hyperparameters in the hierarchical Bayesian model that control the signal-to-noise ratio of the dataset. We take a Markov chain Monte Carlo (MCMC) approach to inferring the model parameters, and then use the MCMC samples to make predictions for future cases. Because the models are simple, the parameters inferred by MCMC are easy to interpret and the computation is very fast. Simulation studies and tests with two real microarray datasets related to complex human diseases show that BCBCSF provides better predictions than two widely used high-dimensional classification methods: prediction analysis for microarrays and diagonal linear discriminant analysis. The R package BCBCSF for the method described here is available from http://math.usask.ca/longhai/software/BCBCSF and CRAN.
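
The feature selection bias described above can be illustrated with a small simulation. The following is a minimal sketch in Python using only the standard library. It is not the BCBCSF method and does not use the BCBCSF R package; the function names, the sample sizes, and the choice of the absolute class-mean difference as the selection score are all assumptions made for illustration. Every feature is generated as pure noise, so any apparent relation to the response among the retained features is produced by the selection step alone.

```python
import random
import statistics


def class_mean_gap(values, labels):
    """Absolute difference between the two class means of one feature."""
    group0 = [v for v, y in zip(values, labels) if y == 0]
    group1 = [v for v, y in zip(values, labels) if y == 1]
    return abs(statistics.fmean(group1) - statistics.fmean(group0))


def simulate_selection_bias(n_cases=40, n_features=2000, n_keep=10, seed=1):
    """Return the mean class gap of the retained features on the data used
    for selection and on independent data (hypothetical illustration)."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_cases)]  # balanced two-class response

    def noise_dataset():
        # Pure noise: no feature is truly related to the class label.
        return [[rng.gauss(0.0, 1.0) for _ in range(n_cases)]
                for _ in range(n_features)]

    train = noise_dataset()
    fresh = noise_dataset()  # independent cases for the same features

    train_gaps = [class_mean_gap(feature, labels) for feature in train]

    # Retain the features that look most strongly related to the response.
    ranked = sorted(range(n_features), key=lambda j: train_gaps[j], reverse=True)
    kept = ranked[:n_keep]

    gap_on_train = statistics.fmean([train_gaps[j] for j in kept])
    gap_on_fresh = statistics.fmean(
        [class_mean_gap(fresh[j], labels) for j in kept]
    )
    return gap_on_train, gap_on_fresh


if __name__ == "__main__":
    on_train, on_fresh = simulate_selection_bias()
    print(f"mean class gap of retained features, selection data: {on_train:.3f}")
    print(f"mean class gap of retained features, fresh data:     {on_fresh:.3f}")
```

In this sketch the retained features show a much larger class gap on the data used for selection than on fresh data, even though the true gap is zero. A model fitted only to the retained features, without accounting for how they were chosen, would therefore tend to overstate the signal-to-noise ratio and produce overconfident predictive probabilities.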

References in zbMATH (referenced in 1 article, 1 standard article)

  1. Li, Longhai: Bias-corrected hierarchical Bayesian classification with a selected subset of high-dimensional features (2012)