ADADELTA: An Adaptive Learning Rate Method

We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first-order information and has minimal computational overhead beyond vanilla stochastic gradient descent. It requires no manual tuning of a learning rate and appears robust to noisy gradient information, different model architecture choices, various data modalities, and selection of hyperparameters. We show promising results compared to other methods on the MNIST digit classification task using a single machine and on a large-scale voice dataset in a distributed cluster environment.
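The per-dimension update the abstract describes can be sketched as follows. This is a minimal single-parameter illustration of the ADADELTA rule (exponentially decayed accumulators of squared gradients and squared updates); the decay constant `rho` and conditioning constant `eps` and their values here are illustrative defaults, not prescribed by the abstract.

```python
import math

def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update for a single parameter.

    state = (E[g^2], E[dx^2]): running averages of squared
    gradients and squared parameter updates.
    """
    eg2, edx2 = state
    eg2 = rho * eg2 + (1 - rho) * grad * grad              # accumulate gradient magnitude
    dx = -(math.sqrt(edx2 + eps) / math.sqrt(eg2 + eps)) * grad  # unit-consistent step
    edx2 = rho * edx2 + (1 - rho) * dx * dx                # accumulate update magnitude
    return x + dx, (eg2, edx2)

# Minimize f(x) = x^2 (gradient 2x) starting from x = 1.0,
# with no hand-tuned learning rate.
x, state = 1.0, (0.0, 0.0)
for _ in range(500):
    x, state = adadelta_step(x, 2 * x, state)
```

Note that the step size is the ratio of two root-mean-square accumulators, which is why no global learning rate appears anywhere in the update.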

Referenced in 45 articles in zbMATH

Showing results 1 to 20 of 45, sorted by year (citations):

  1. Jia, Yichen; Jeong, Jong-Hyeon: Deep learning for quantile regression under right censoring: deepquantreg (2022)
  2. De Loera, Jesús A.; Haddock, Jamie; Ma, Anna; Needell, Deanna: Data-driven algorithm selection and tuning in optimization and signal processing (2021)
  3. Liu, Yang; Roosta, Fred: Convergence of Newton-MR under inexact Hessian information (2021)
  4. Montiel, Jacob; Halford, Max; Mastelini, Saulo Martiello; Bolmier, Geoffrey; Sourty, Raphael; Vaysse, Robin; Zouitine, Adil; Gomes, Heitor Murilo; Read, Jesse; Abdessalem, Talel; Bifet, Albert: River: machine learning for streaming data in Python (2021)
  5. Nigri, Andrea; Levantesi, Susanna; Marino, Mario: Life expectancy and lifespan disparity forecasting: a long short-term memory approach (2021)
  6. Ogihara, Teppei: Misspecified diffusion models with high-frequency observations and an application to neural networks (2021)
  7. Prazeres, Mariana; Oberman, Adam M.: Stochastic gradient descent with Polyak’s learning rate (2021)
  8. Yin, Yongjing; Lai, Shaopeng; Song, Linfeng; Zhou, Chulun; Han, Xianpei; Yao, Junfeng; Su, Jinsong: An external knowledge enhanced graph-based neural network for sentence ordering (2021)
  9. Baladram, Mohammad Samy; Koike, Atsushi; Yamada, Kazunori D.: Introduction to supervised machine learning for data science (2020)
  10. Burkhart, Michael C.; Brandman, David M.; Franco, Brian; Hochberg, Leigh R.; Harrison, Matthew T.: The discriminative Kalman filter for Bayesian filtering with nonlinear and nongaussian observation models (2020)
  11. Chung, Julianne; Chung, Matthias; Tanner Slagel, J.; Tenorio, Luis: Sampled limited memory methods for massive linear inverse problems (2020)
  12. Da Silva, Andre Belotto; Gazeau, Maxime: A general system of differential equations to model first-order adaptive algorithms (2020)
  13. De, Subhayan; Maute, Kurt; Doostan, Alireza: Bi-fidelity stochastic gradient descent for structural optimization under uncertainty (2020)
  14. Do, Dieu T. T.; Nguyen-Xuan, H.; Lee, Jaehong: Material optimization of tri-directional functionally graded plates by using deep neural network and isogeometric multimesh design approach (2020)
  15. Erway, Jennifer B.; Griffin, Joshua; Marcia, Roummel F.; Omheni, Riadh: Trust-region algorithms for training responses: machine learning methods using indefinite Hessian approximations (2020)
  16. Geng, Zhenglin; Johnson, Daniel; Fedkiw, Ronald: Coercing machine learning to output physically accurate results (2020)
  17. Göttlich, Simone; Knapp, Stephan: Artificial neural networks for the estimation of pedestrian interaction forces (2020)
  18. Henderson, Donna; Lunter, Gerton: Efficient inference in state-space models through adaptive learning in online Monte Carlo expectation maximization (2020)
  19. Karumuri, Sharmila; Tripathy, Rohit; Bilionis, Ilias; Panchal, Jitesh: Simulator-free solution of high-dimensional stochastic elliptic partial differential equations using deep neural networks (2020)
  20. Kylasa, Sudhir; Fang, Chih-Hao; Roosta, Fred; Grama, Ananth: Parallel optimization techniques for machine learning (2020)
