ADADELTA: An Adaptive Learning Rate Method

We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first-order information and has minimal computational overhead beyond vanilla stochastic gradient descent. It requires no manual tuning of a learning rate and appears robust to noisy gradient information, different model architecture choices, various data modalities, and selection of hyperparameters. We show promising results compared to other methods on the MNIST digit classification task using a single machine and on a large-scale voice dataset in a distributed cluster environment.
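The per-dimension update described in the abstract maintains two decayed running averages, one of squared gradients and one of squared parameter updates, and scales each step by their ratio, which is what removes the need for a hand-tuned learning rate. A minimal sketch of this update rule (the function name `adadelta_update` and the dictionary-based state are illustrative choices, not from the paper; the decay constant `rho=0.95` and `eps=1e-6` follow the values commonly reported for this method):

```python
import numpy as np

def adadelta_update(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA step; returns the parameter update (delta x)."""
    # Decayed accumulation of squared gradients: E[g^2]
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad**2
    # Step size is RMS of past updates over RMS of gradients,
    # so no global learning rate appears anywhere.
    update = -np.sqrt(state["Edx2"] + eps) / np.sqrt(state["Eg2"] + eps) * grad
    # Decayed accumulation of squared updates: E[dx^2]
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * update**2
    return update

# Example: minimize f(x) = x^2 (gradient 2x) starting from x = 1.0
state = {"Eg2": 0.0, "Edx2": 0.0}
x = 1.0
for _ in range(500):
    x += adadelta_update(2 * x, state)
```

Because `eps` seeds the numerator before any updates have accumulated, the first steps are tiny and the effective step size grows as the `Edx2` statistic fills in, which matches the method's self-tuning behavior.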


References in zbMATH (referenced in 69 articles)

Showing results 1 to 20 of 69.
Sorted by year (citations)


  1. Bai, Jinshuai; Zhou, Ying; Ma, Yuwei; Jeong, Hyogu; Zhan, Haifei; Rathnayaka, Charith; Sauret, Emilie; Gu, Yuantong: A general neural particle method for hydrodynamics modeling (2022)
  2. Basir, Shamsulhaq; Senocak, Inanc: Physics and equality constrained artificial neural networks: application to forward and inverse problems with multi-fidelity data fusion (2022)
  3. Chen, Jingrun; Jin, Shi; Lyu, Liyao: A consensus-based global optimization method with adaptive momentum estimation (2022)
  4. Clausen, Johan Bjerre Bach; Li, Hongyan: Big data driven order-up-to level model: application of machine learning (2022)
  5. Hu, C.; Martin, S.; Dingreville, R.: Accelerating phase-field predictions via recurrent neural networks learning the microstructure evolution in latent space (2022)
  6. Inage, Sin-ichi; Hebishima, Hana: Application of Monte Carlo stochastic optimization (MOST) to deep learning (2022)
  7. Jia, Yichen; Jeong, Jong-Hyeon: Deep learning for quantile regression under right censoring: deepquantreg (2022)
  8. Kim, Sehwan; Song, Qifan; Liang, Faming: Stochastic gradient Langevin dynamics with adaptive drifts (2022)
  9. Liu, Hailiang; Tian, Xuping: An adaptive gradient method with energy and momentum (2022)
  10. Ly, Duy K.; Truong, Tam T.; Nguyen-Thoi, T.: Multi-objective optimization of laminated functionally graded carbon nanotube-reinforced composite plates using deep feedforward neural networks-NSGAII algorithm (2022)
  11. Salti, Mehmet; Kangal, Evrim Ersin: Deep learning of CMB radiation temperature (2022)
  12. Sharrock, Louis; Kantas, Nikolas: Joint online parameter estimation and optimal sensor placement for the partially observed stochastic advection-diffusion equation (2022)
  13. Yan, Yonggui; Xu, Yangyang: Adaptive primal-dual stochastic gradient method for expectation-constrained convex stochastic programs (2022)
  14. De Loera, Jesús A.; Haddock, Jamie; Ma, Anna; Needell, Deanna: Data-driven algorithm selection and tuning in optimization and signal processing (2021)
  15. Feng, Miria; Feng, Wenying: Evaluation of parallel and sequential deep learning models for music subgenre classification (2021)
  16. Hyde, David A. B.; Bao, Michael; Fedkiw, Ronald: On obtaining sparse semantic solutions for inverse problems, control, and neural network training (2021)
  17. Liu, Yang; Roosta, Fred: Convergence of Newton-MR under inexact Hessian information (2021)
  18. Montiel, Jacob; Halford, Max; Mastelini, Saulo Martiello; Bolmier, Geoffrey; Sourty, Raphael; Vaysse, Robin; Zouitine, Adil; Gomes, Heitor Murilo; Read, Jesse; Abdessalem, Talel; Bifet, Albert: River: machine learning for streaming data in Python (2021)
  19. Nigri, Andrea; Levantesi, Susanna; Marino, Mario: Life expectancy and lifespan disparity forecasting: a long short-term memory approach (2021)
  20. Ogihara, Teppei: Misspecified diffusion models with high-frequency observations and an application to neural networks (2021)
