Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR.

Variable selection can be challenging, particularly in situations with a large number of predictors with possibly high correlations, such as gene expression data. In this article, a new method, called OSCAR (octagonal shrinkage and clustering algorithm for regression), is proposed to simultaneously select variables while grouping them into predictive clusters. In addition to improving prediction accuracy and interpretation, the resulting groups can then be investigated further to discover what contributes to the group having similar behavior. The technique is based on penalized least squares with a geometrically intuitive penalty function that shrinks some coefficients to exactly zero. Additionally, this penalty yields exact equality of some coefficients, encouraging correlated predictors that have a similar effect on the response to form predictive clusters represented by a single coefficient. The proposed procedure is shown to compare favorably to existing shrinkage and variable selection techniques in terms of both prediction error and model complexity, while yielding the additional grouping information.
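
The abstract does not write out the penalty. As a minimal sketch, assuming the form in which the OSCAR penalty is usually stated (the L1 norm of the coefficients plus a constant c times the sum of pairwise maxima of their absolute values), it could be evaluated as follows. The function names and the parameter name `c` are hypothetical and chosen only for illustration; this is not the authors' implementation and it does not fit a model.

```python
from itertools import combinations


def oscar_penalty(beta, c):
    """Assumed OSCAR penalty: sum_j |b_j| + c * sum_{j<k} max(|b_j|, |b_k|)."""
    abs_beta = [abs(b) for b in beta]
    l1_term = sum(abs_beta)
    # The pairwise-maximum term is what encourages coefficients to be equal.
    pairwise_term = sum(max(a, b) for a, b in combinations(abs_beta, 2))
    return l1_term + c * pairwise_term


def oscar_penalty_sorted(beta, c):
    """Same value via sorting: the j-th smallest |b| (0-based) gets weight c*j + 1."""
    abs_sorted = sorted(abs(b) for b in beta)
    return sum((c * j + 1) * a for j, a in enumerate(abs_sorted))


if __name__ == "__main__":
    coefficients = [1.0, -2.0, 0.0]
    print(oscar_penalty(coefficients, 0.5))
    print(oscar_penalty_sorted(coefficients, 0.5))
```

The sorted form shows that, under this assumption, the penalty is a weighted L1 norm in which larger coefficients carry larger weights. The L1 part is what allows exact zeros and the pairwise part is what allows exact ties among coefficients.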

References in zbMATH (referenced in 58 articles, 1 standard article)

Showing results 1 to 20 of 58, sorted by year.


  1. Celentano, Michael; Montanari, Andrea: Fundamental barriers to high-dimensional regression with convex penalties (2022)
  2. He, Xiaohong; Yang, Yaohong; Wang, Lei: Generalised regression estimators for average treatment effect with multicollinearity in high-dimensional covariates (2022)
  3. Liu, Y. C.; Xia, F. Q.: Linear convergence of proximal incremental aggregated gradient method for nonconvex nonsmooth minimization problems (2022)
  4. Chen, Jia; Li, Degui; Wei, Lingling; Zhang, Wenyang: Nonparametric homogeneity pursuit in functional-coefficient models (2021)
  5. Cui, Qiurong; Xu, Yuqing; Zhang, Zhengjun; Chan, Vincent: Max-linear regression models with regularization (2021)
  6. Li, Yaguang; Xu, Wei; Gao, Xin: Graphical-model based high dimensional generalized linear models (2021)
  7. Rezaei, Mostafa; Cribben, Ivor; Samorani, Michele: A clustering-based feature selection method for automatically generated relational attributes (2021)
  8. Ye, Jane J.; Yuan, Xiaoming; Zeng, Shangzhi; Zhang, Jin: Variational analysis perspective on linear convergence of some first order methods for nonsmooth convex optimization problems (2021)
  9. Calderon, Hernan; Santibañez, Felipe; Silva, Jorge F.; Ortiz, Julián M.; Egaña, Alvaro: Geological facies recovery based on weighted \(\ell_1\)-regularization (2020)
  10. Do, Hyungrok; Cheon, Myun-Seok; Kim, Seoung Bum: Graph structured sparse subset selection (2020)
  11. Li, Yuan; Mark, Benjamin; Raskutti, Garvesh; Willett, Rebecca; Song, Hyebin; Neiman, David: Graph-based regularization for regression problems with alignment and highly correlated designs (2020)
  12. Ren, Sheng; Kang, Emily L.; Lu, Jason L.: MCEN: a method of simultaneous variable selection and clustering for high-dimensional multinomial regression (2020)
  13. Sherwood, Ben; Molstad, Aaron J.; Singha, Sumanta: Asymptotic properties of concave \(L_1\)-norm group penalties (2020)
  14. Yue, Mu; Huang, Lei: A new approach of subgroup identification for high-dimensional longitudinal data (2020)
  15. Chakraborty, Sounak; Lozano, Aurelie C.: A graph Laplacian prior for Bayesian variable selection and grouping (2019)
  16. Chi, Eric C.; Steinerberger, Stefan: Recovering trees with convex clustering (2019)
  17. Lederer, Johannes; Yu, Lu; Gaynanova, Irina: Oracle inequalities for high-dimensional prediction (2019)
  18. Liu, Jianyu; Yu, Guan; Liu, Yufeng: Graph-based sparse linear discriminant analysis for high-dimensional classification (2019)
  19. Zhang, Yingying; Wang, Huixia Judy; Zhu, Zhongyi: Quantile-regression-based clustering for panel data (2019)
  20. Hui, Francis K. C.; Müller, Samuel; Welsh, A. H.: Sparse pairwise likelihood estimation for multivariate longitudinal mixed models (2018)
