R package boost: BagBoosting for tumor classification with gene expression data.

Motivation: Microarray experiments are expected to contribute significantly to progress in cancer treatment by enabling precise and early diagnosis. They create a need for class prediction tools that can deal with a large number of highly correlated input variables, perform feature selection, and provide class probability estimates that quantify the predictive uncertainty. A very promising solution is to combine the two ensemble schemes, bagging and boosting, into a novel algorithm called BagBoosting.

Results: When bagging is used as a module in boosting, the resulting classifier consistently improves on the predictive performance and the probability estimates of both bagging and boosting on real and simulated gene expression data. This quasi-guaranteed improvement can be obtained simply by spending more computing effort. The advantageous predictive potential is also confirmed by comparing BagBoosting to several established class prediction tools for microarray data.

Availability: Software for the modified boosting algorithms, for benchmark studies, and for the simulation of microarray data is available as an R package under GNU public license at dettling/bagboost.html
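The idea of using bagging as a module inside boosting can be illustrated with a minimal sketch. It is not the R package's implementation: it assumes squared-error boosting on labels coded as -1/+1 and regression stumps as the base learner, and every function name and parameter (`fit_stump`, `bagboost_fit`, `n_bags`, `shrinkage`, and so on) is hypothetical. In each boosting round, the base learner is not a single stump but the average of several stumps fitted on bootstrap resamples of the current residuals.

```python
import random


def fit_stump(X, r):
    """Least-squares stump on targets r: (feature, threshold, left_value, right_value)."""
    n, p = len(X), len(X[0])
    mean_r = sum(r) / n
    # Fallback when no feature admits a split: predict the mean everywhere.
    best_gain, best = -1.0, (0, float("inf"), mean_r, mean_r)
    for j in range(p):
        pairs = sorted((row[j], ri) for row, ri in zip(X, r))
        total = sum(ri for _, ri in pairs)
        left_sum = 0.0
        for i in range(1, n):
            left_sum += pairs[i - 1][1]
            if pairs[i][0] == pairs[i - 1][0]:
                continue  # no threshold fits between equal values
            left_mean = left_sum / i
            right_mean = (total - left_sum) / (n - i)
            # Maximizing this quantity minimizes the squared error of the split.
            gain = i * left_mean ** 2 + (n - i) * right_mean ** 2
            if gain > best_gain:
                best_gain = gain
                threshold = 0.5 * (pairs[i][0] + pairs[i - 1][0])
                best = (j, threshold, left_mean, right_mean)
    return best


def predict_stump(stump, x):
    j, threshold, left_value, right_value = stump
    return left_value if x[j] <= threshold else right_value


def fit_bagged_stumps(X, r, n_bags, rng):
    """Bagging step: fit one stump per bootstrap resample of (X, r)."""
    n = len(X)
    stumps = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [r[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


def bagboost_fit(X, y, n_rounds=20, n_bags=10, shrinkage=0.5, seed=0):
    """Boosting loop whose base learner is a bagged ensemble of stumps."""
    rng = random.Random(seed)
    F = [0.0] * len(X)  # current additive fit on the training points
    rounds = []
    for _ in range(n_rounds):
        residual = [yi - fi for yi, fi in zip(y, F)]
        stumps = fit_bagged_stumps(X, residual, n_bags, rng)
        F = [fi + shrinkage * predict_bagged(stumps, xi) for fi, xi in zip(F, X)]
        rounds.append(stumps)
    return {"rounds": rounds, "shrinkage": shrinkage}


def bagboost_score(model, x):
    return sum(model["shrinkage"] * predict_bagged(s, x) for s in model["rounds"])


def bagboost_predict(model, x):
    return 1 if bagboost_score(model, x) >= 0 else -1
```

In this sketch, `n_bags` is the knob that trades computing effort for a more stable base learner: setting it to 1 gives boosting with a single stump fitted on one bootstrap resample per round, while larger values average over more resamples in every round.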

References in zbMATH (referenced in 36 articles)

  1. Bühlmann, Peter: Statistical significance in high-dimensional linear models (2013)
  2. Telaar, Anna; Repsilber, Dirk; Nürnberg, Gerd: Biomarker discovery: classification using pooled samples (2013)
  3. Wang, Tao; Zhu, Lixing: Sparse sufficient dimension reduction using optimal scoring (2013)
  4. Sun, Wei; Wang, Junhui; Fang, Yixin: Regularized k-means clustering of high-dimensional data and its asymptotic consistency (2012)
  5. Zhang, Chun-Xia; Wang, Guan-Wei; Zhang, Jiang-She: An empirical bias-variance analysis of DECORATE ensemble method at different training sample sizes (2012)
  6. Kabán, Ata: On the distance concentration awareness of certain data reduction techniques (2011)
  7. Huang, Song; Tong, Tiejun; Zhao, Hongyu: Bias-corrected diagonal discriminant rules for high-dimensional classification (2010)
  8. Tuna, Salih; Niranjan, Mahesan: Inference from low precision transcriptome data representation (2010)
  9. Wu, Tong Tong; Lange, Kenneth: Multicategory vertex discriminant analysis for high-dimensional data (2010)
  10. Durrant, Robert J.; Kabán, Ata: When is 'nearest neighbour' meaningful: A converse theorem and implications (2009)
  11. Hong, Jin-Hyuk; Cho, Sung-Bae: Gene boosting for cancer classification based on gene expression profiles (2009)
  12. Shim, Jooyong; Sohn, Insuk; Kim, Sujong; Lee, Jae Won; Green, Paul E.; Hwang, Changha: Selecting marker genes for cancer classification using supervised weighted kernel clustering and the support vector machine (2009)
  13. Jin, Xin; Xu, Anbang; Bie, Rongfang: Cancer classification from serial analysis of gene expression with event models (2008)
  14. Leng, Chenlei: Sparse optimal scoring for multiclass cancer diagnosis and biomarker detection using microarray data (2008)
  15. Bühlmann, Peter; Hothorn, Torsten: Boosting algorithms: regularization, prediction and model fitting (2007)
  16. Bühlmann, Peter; Hothorn, Torsten: Rejoinder: Boosting algorithms: regularization, prediction and model fitting (2007)