SGDR: Stochastic Gradient Descent with Warm Restarts

Restart techniques are common in gradient-free optimization to deal with multimodal functions. Partial warm restarts are also gaining popularity in gradient-based optimization, where they improve the rate of convergence of accelerated gradient schemes on ill-conditioned functions. In this paper, we propose a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks. We empirically study its performance on the CIFAR-10 and CIFAR-100 datasets, where we demonstrate new state-of-the-art results at 3.14% and 16.21% error, respectively. We also demonstrate its advantages on a dataset of EEG recordings and on a downsampled version of the ImageNet dataset. Our source code is available at
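The core of the technique is a cosine-annealed learning rate that is periodically reset to its maximum ("warm restart"), with each cycle optionally longer than the last. A minimal sketch of such a schedule is shown below; the values of eta_min, eta_max, and the cycle lengths are illustrative placeholders, not the paper's tuned settings.

```python
import math

def sgdr_lr(step, eta_min=0.0, eta_max=0.1, t_0=10, t_mult=2):
    """Cosine-annealed learning rate with warm restarts.

    step:   epochs (or batches) elapsed since training began.
    t_0:    length of the first cycle; t_mult scales each following cycle.
    Hyperparameter values here are illustrative, not the paper's settings.
    """
    # Walk through completed cycles to find the current cycle length t_i
    # and the position t_cur within that cycle.
    t_i, t_cur = t_0, step
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Cosine annealing from eta_max down toward eta_min within the cycle;
    # at the start of every cycle (t_cur == 0) the rate jumps back to eta_max.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))
```

For example, with t_0=10 and t_mult=2, the rate starts at eta_max, decays along a half-cosine for 10 steps, restarts at step 10, and then decays over a 20-step cycle, and so on; the anytime performance benefit comes from the model passing through many low-learning-rate phases during a single training run.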

References in zbMATH (referenced in 24 articles)

Showing results 1 to 20 of 24.
Sorted by year (citations)


  1. Ainsworth, Mark; Shin, Yeonjong: Active neuron least squares: a training method for multivariate rectified neural networks (2022)
  2. Gajek, Sebastian; Schneider, Matti; Böhlke, Thomas: An FE-DMN method for the multiscale analysis of thermomechanical composites (2022)
  3. Li, Shao-Yuan; Shi, Ye; Huang, Sheng-Jun; Chen, Songcan: Improving deep label noise learning with dual active label correction (2022)
  4. Wang, Bao; Nguyen, Tan; Sun, Tao; Bertozzi, Andrea L.; Baraniuk, Richard G.; Osher, Stanley J.: Scheduled restart momentum for accelerated stochastic gradient descent (2022)
  5. Yeo, Kyongmin; Li, Zan; Gifford, Wesley: Generative adversarial network for probabilistic forecast of random dynamical systems (2022)
  6. Bakhtin, Anton; Deng, Yuntian; Gross, Sam; Ott, Myle; Ranzato, Marc’aurelio; Szlam, Arthur: Residual energy-based models for text (2021)
  7. Zhang, Bowen; Wang, Yidong; Hou, Wenxin; Wu, Hao; Wang, Jindong; Okumura, Manabu; Shinozaki, Takahiro: FlexMatch: boosting semi-supervised learning with curriculum pseudo labeling (2021) arXiv
  8. Canayaz, Murat: C+effxnet: a novel hybrid approach for COVID-19 diagnosis on CT images based on CBAM and EfficientNet (2021)
  9. Li, Changlin; Tang, Tao; Wang, Guangrun; Peng, Jiefeng; Wang, Bing; Liang, Xiaodan; Chang, Xiaojun: BossNAS: exploring hybrid CNN-transformers with block-wisely self-supervised neural architecture search (2021) arXiv
  10. Chatigny, Philippe; Patenaude, Jean-Marc; Wang, Shengrui: Spatiotemporal adaptive neural network for long-term forecasting of financial time series (2021)
  11. Gajek, Sebastian; Schneider, Matti; Böhlke, Thomas: An FE-DMN method for the multiscale analysis of short fiber reinforced plastic components (2021)
  12. Wang, Qingzhong; Zhang, Pengfei; Xiong, Haoyi; Zhao, Jian: Face.evoLVe: a high-performance face recognition library (2021) arXiv
  13. Rawson, Michael; Reger, Giles: lazyCoP: lazy paramodulation meets neurally guided search (2021)
  14. Eimer, Theresa; Biedenkapp, André; Reimer, Maximilian; Adriaensen, Steven; Hutter, Frank; Lindauer, Marius: DACBench: a benchmark library for dynamic algorithm configuration (2021) arXiv
  15. Yeo, Kyongmin; Grullon, Dylan E. C.; Sun, Fan-Keng; Boning, Duane S.; Kalagnanam, Jayant R.: Variational inference formulation for a model-free simulation of a dynamical system with unknown parameters by a recurrent neural network (2021)
  16. Banert, Sebastian; Ringh, Axel; Adler, Jonas; Karlsson, Johan; Öktem, Ozan: Data-driven nonsmooth optimization (2020)
  17. Chen, Yiming; Pan, Tianci; He, Cheng; Cheng, Ran: Efficient evolutionary deep neural architecture search (NAS) by noisy network morphism mutation (2020)
  18. Kang, Dongseok; Ahn, Chang Wook: Efficient neural network space with genetic search (2020)
  19. Sohn, Kihyuk; Berthelot, David; Li, Chun-Liang; Zhang, Zizhao; Carlini, Nicholas; Cubuk, Ekin D.; Kurakin, Alex; Zhang, Han; Raffel, Colin: FixMatch: simplifying semi-supervised learning with consistency and confidence (2020) arXiv
  20. Mohamed, Shakir; Rosca, Mihaela; Figurnov, Michael; Mnih, Andriy: Monte Carlo gradient estimation in machine learning (2020)
