Adam: A Method for Stochastic Optimization

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has low memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, by which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
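
The abstract describes the update only in words ("adaptive estimates of lower-order moments"). As an illustration, the following is a minimal pure-Python sketch of an Adam-style update with bias-corrected first- and second-moment estimates. The function names `adam_minimize` and `quadratic_grad`, the toy objective, and the default values (alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8) are assumptions made for this example and are not stated on this page; treat it as a sketch rather than a reference implementation.

```python
import math


def adam_minimize(grad, theta, steps=1000, alpha=0.001,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Run `steps` Adam-style updates starting from `theta`.

    `grad` maps a list of parameters to a list of partial derivatives.
    Default hyper-parameters are assumed values, not taken from this page.
    """
    theta = list(theta)
    m = [0.0] * len(theta)  # running estimate of the first moment (mean)
    v = [0.0] * len(theta)  # running estimate of the second raw moment
    for t in range(1, steps + 1):
        g = grad(theta)
        for i, g_i in enumerate(g):
            m[i] = beta1 * m[i] + (1 - beta1) * g_i
            v[i] = beta2 * v[i] + (1 - beta2) * g_i * g_i
            # Correct the bias caused by initializing m and v at zero.
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            # Per-coordinate step, scaled by the second-moment estimate.
            theta[i] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta


def quadratic_grad(theta):
    """Gradient of the toy objective (x - 3)^2 + (y + 1)^2 (hypothetical)."""
    return [2 * (theta[0] - 3.0), 2 * (theta[1] + 1.0)]


first_step = adam_minimize(quadratic_grad, [0.0, 0.0], steps=1, alpha=0.01)
result = adam_minimize(quadratic_grad, [0.0, 0.0], steps=5000, alpha=0.01)
print(first_step, result)
```

In this sketch the very first step has magnitude close to `alpha` in every coordinate, whatever the size of the gradient, which is one way to see the invariance to diagonal rescaling mentioned in the abstract.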

References in zbMATH (referenced in 964 articles)

Showing results 1 to 20 of 964, sorted by year.

  1. Abueidda, Diab W.; Koric, Seid; Al-Rub, Rashid Abu; Parrott, Corey M.; James, Kai A.; Sobh, Nahil A.: A deep learning energy method for hyperelasticity and viscoelasticity (2022)
  2. Remita, Amine M.; Diallo, Abdoulaye Baniré: EvoVGM: a Deep Variational Generative Model for Evolutionary Parameter Estimation (2022) arXiv
  3. Antonietti, P. F.; Manuzzi, E.: Refinement of polygonal grids using convolutional neural networks with applications to polygonal discontinuous Galerkin and virtual element methods (2022)
  4. Bouazizi, Arij; Holzbock, Adrian; Kressel, Ulrich; Dietmayer, Klaus; Belagiannis, Vasileios: MotionMixer: MLP-based 3D Human Body Pose Forecasting (2022) arXiv
  5. Ascione, Giacomo; Cuomo, Salvatore: A sojourn-based approach to semi-Markov reinforcement learning (2022)
  6. Ashmore, Anthony; Deen, Rehan; He, Yang-Hui; Ovrut, Burt A.: Machine learning line bundle connections (2022)
  7. Badreddine, Samy; d’Avila Garcez, Artur; Serafini, Luciano; Spranger, Michael: Logic tensor networks (2022)
  8. Bai, Jinshuai; Zhou, Ying; Ma, Yuwei; Jeong, Hyogu; Zhan, Haifei; Rathnayaka, Charith; Sauret, Emilie; Gu, Yuantong: A general neural particle method for hydrodynamics modeling (2022)
  9. Bai, Xiao-Dong; Zhang, Wei: Machine learning for vortex induced vibration in turbulent flow (2022)
  10. Bao, Jiakang; He, Yang-Hui; Hirst, Edward; Hofscheier, Johannes; Kasprzyk, Alexander; Majumder, Suvajit: Hilbert series, machine learning, and applications to physics (2022)
  11. Basir, Shamsulhaq; Senocak, Inanc: Physics and equality constrained artificial neural networks: application to forward and inverse problems with multi-fidelity data fusion (2022)
  12. Benamou, Jean-David; Chazareix, Guillaume; IJzerman, Wilbert; Rukhaia, Giorgi: Point source regularization of the finite source reflector problem (2022)
  13. Benatti, Simone; Young, Aaron; Elmquist, Asher; Taves, Jay; Tasora, Alessandro; Serban, Radu; Negrut, Dan: End-to-end learning for off-road terrain navigation using the chrono open-source simulation platform (2022)
  14. Bergner, Yoav; Halpin, Peter; Vie, Jill-Jênn: Multidimensional item response theory in the style of collaborative filtering (2022)
  15. Berk Wheelock, Lauren; Pachamanova, Dessislava A.: Acceptable set topic modeling (2022)
  16. Berrone, Stefano; Canuto, Claudio; Pintore, Moreno: Variational physics informed neural networks: the role of quadratures and test functions (2022)
  17. Bezgin, Deniz A.; Schmidt, Steffen J.; Adams, Nikolaus A.: WENO3-NN: a maximum-order three-point data-driven weighted essentially non-oscillatory scheme (2022)
  18. Bihlo, Alex; Popovych, Roman O.: Physics-informed neural networks for the shallow-water equations on the sphere (2022)
  19. Black, Nolan; Najafi, Ahmad R.: Learning finite element convergence with the multi-fidelity graph neural network (2022)
  20. Blier-Wong, Christopher; Cossette, Hélène; Lamontagne, Luc; Marceau, Etienne: Geographic ratemaking with spatial embeddings (2022)