R package boost: BagBoosting for tumor classification with gene expression data.

Motivation: Microarray experiments are expected to contribute significantly to progress in cancer treatment by enabling precise and early diagnosis. They create a need for class prediction tools that can deal with a large number of highly correlated input variables, perform feature selection, and provide class probability estimates that quantify the predictive uncertainty. A very promising solution is to combine the two ensemble schemes bagging and boosting into a novel algorithm called BagBoosting.

Results: When bagging is used as a module in boosting, the resulting classifier consistently improves on the predictive performance and the probability estimates of both bagging and boosting, on real and simulated gene expression data. This quasi-guaranteed improvement can be obtained simply by investing more computing effort. The advantageous predictive potential is also confirmed by comparing BagBoosting to several established class prediction tools for microarray data.

Availability: Software for the modified boosting algorithms, for benchmark studies, and for the simulation of microarray data is available as an R package under the GNU General Public License at dettling/bagboost.html
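The core idea above — plugging bagging in as the base-learner module of a boosting loop — can be sketched in a few lines. The following is an illustrative Python sketch using AdaBoost-style reweighting over bagged decision stumps; the actual package is implemented in R and its algorithm and API differ, so every name and parameter below is an assumption for illustration only.

```python
import math
import random

def fit_stump(X, y, w):
    # Weighted decision stump: pick the (feature, threshold, sign)
    # minimizing the weighted 0-1 training error.
    best = None  # (error, feature, threshold, sign)
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if (sign if xi[j] <= thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    _, j, thr, sign = best
    return lambda x: sign if x[j] <= thr else -sign

def fit_bagged_stumps(X, y, w, B, rng):
    # Bagging module: B stumps, each fit on a bootstrap sample drawn
    # according to the current boosting weights; predict by majority vote.
    n = len(X)
    stumps = []
    for _ in range(B):
        idx = rng.choices(range(n), weights=w, k=n)
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stumps.append(fit_stump(Xb, yb, [1.0 / n] * n))
    return lambda x: 1 if sum(s(x) for s in stumps) >= 0 else -1

def bagboost(X, y, M=8, B=5, seed=0):
    # AdaBoost-style loop with the bagged-stump module as base learner.
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(M):
        h = fit_bagged_stumps(X, y, w, B, rng)
        err = sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        w = [wi * math.exp(-alpha * yi * h(xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
        ensemble.append((alpha, h))
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
```

Replacing the single weak learner of each boosting round with a bagged ensemble is what makes the improvement "purchasable" with extra computation: the inner loop over B bootstrap replicates multiplies the cost by B but stabilizes each round's base classifier.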

References in zbMATH (referenced in 36 articles)

Showing results 1 to 20 of 36.
Sorted by year (citations)


  1. Cai, Jia; Huo, Junyi: Sparse generalized canonical correlation analysis via linearized Bregman method (2020)
  2. Huo, Yanhao; Xin, Lihui; Kang, Chuanze; Wang, Minghui; Ma, Qin; Yu, Bin: SGL-SVM: a novel method for tumor classification via support vector machine with sparse group lasso (2020)
  3. Yang, Aijun; Tian, Yuzhu; Li, Yunxian; Lin, Jinguan: Sparse Bayesian variable selection in kernel probit model for analyzing high-dimensional data (2020)
  4. Yin, Zanhua: Variable selection for sparse logistic regression (2020)
  5. Jiang, Binyan; Wang, Xiangyu; Leng, Chenlei: A direct approach for sparse quadratic discriminant analysis (2018)
  6. Arias-Castro, Ery; Pu, Xiao: A simple approach to sparse clustering (2017)
  7. Bertsimas, Dimitris; King, Angela; Mazumder, Rahul: Best subset selection via a modern optimization lens (2016)
  8. Cheng, Lulu; Kim, Inyoung; Pang, Herbert: Bayesian semiparametric model for pathway-based analysis with zero-inflated clinical outcomes (2016)
  9. Fan, Yan; Gai, Yujie; Zhu, Lixing: Asymptotics of Dantzig selector for a general single-index model (2016)
  10. Safo, Sandra E.; Ahn, Jeongyoun: General sparse multi-class linear discriminant analysis (2016)
  11. Ahn, Jeongyoun; Jeon, Yongho: Sparse HDLSS discrimination with constrained data piling (2015)
  12. Donoho, David; Jin, Jiashun: Higher criticism for large-scale inference, especially for rare and weak effects (2015)
  13. Müller, Patric; van de Geer, Sara: The partial linear model in high dimensions (2015)
  14. Pamukçu, Esra; Bozdogan, Hamparsum; Çalık, Sinan: A novel hybrid dimension reduction technique for undersized high dimensional gene expression data sets using information complexity criterion for cancer classification (2015)
  15. Wang, Tao; Zhu, LiXing: A distribution-based Lasso for a general single-index model (2015)
  16. Yang, Aijun; Li, Yunxian; Tang, Niansheng; Lin, Jinguan: Bayesian variable selection in multinomial probit model for classifying high-dimensional data (2015)
  17. Bühlmann, Peter; Mandozzi, Jacopo: High-dimensional variable screening and bias in subsequent inference, with an empirical comparison (2014)
  18. Hall, Peter; Jin, Jiashun; Miller, Hugh: Feature selection when there are many influential features (2014)
  19. Kim, Kyung In; Simon, Richard: Overfitting, generalization, and MSE in class probability estimation with high-dimensional data (2014)
  20. Roberts, S.; Nowak, G.: Stabilizing the lasso against cross-validation variability (2014)
