AdaGrad

ADAGRAD: adaptive gradient algorithm; Adaptive subgradient methods for online learning and stochastic optimization. We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints. We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.
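The per-coordinate adaptation described in the abstract can be illustrated with a minimal sketch of the commonly used diagonal variant, in which each coordinate's step is scaled by the inverse square root of its accumulated squared gradients. The names `adagrad_step` and `demo`, the learning rate `lr`, and the stabilizer `eps` are assumptions made for illustration and are not taken from this page; the sketch leaves out the regularization functions and domain constraints that the abstract mentions.

```python
import math


def adagrad_step(w, g, accum, lr=0.1, eps=1e-8):
    """One diagonal AdaGrad update (illustrative sketch).

    w, g, accum are equal-length lists: weights, current gradient,
    and the running sum of squared gradients per coordinate.
    lr and eps are assumed values, not prescribed by the source.
    """
    # Accumulate squared gradients coordinate by coordinate.
    new_accum = [a + gi * gi for a, gi in zip(accum, g)]
    # Scale each coordinate's step by 1 / sqrt(accumulated squared gradient).
    new_w = [
        wi - lr * gi / (math.sqrt(a) + eps)
        for wi, gi, a in zip(w, g, new_accum)
    ]
    return new_w, new_accum


def demo():
    """Feature 0 is active on every step; feature 1 only on the last step."""
    w = [0.0, 0.0]
    accum = [0.0, 0.0]
    steps = []
    grads = [[1.0, 0.0]] * 9 + [[1.0, 1.0]]
    for g in grads:
        new_w, accum = adagrad_step(w, g, accum)
        # Record the absolute size of the move in each coordinate.
        steps.append([abs(a - b) for a, b in zip(new_w, w)])
        w = new_w
    return w, accum, steps


if __name__ == "__main__":
    w, accum, steps = demo()
    print("accumulators:", accum)
    print("last step sizes:", steps[-1])
```

In this toy run the rarely seen feature still takes a nearly full-size step on its first appearance, while the step for the frequently seen feature has already shrunk. This is the behaviour the abstract describes as finding very predictive but rarely seen features.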


References in zbMATH (referenced in 124 articles, 1 standard article)

Showing results 1 to 20 of 124.
Sorted by year (citations)

  1. Barakat, Anas; Bianchi, Pascal: Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization (2021)
  2. De Loera, Jesús A.; Haddock, Jamie; Ma, Anna; Needell, Deanna: Data-driven algorithm selection and tuning in optimization and signal processing (2021)
  3. Duchi, John C.; Ruan, Feng: Asymptotic optimality in stochastic optimization (2021)
  4. Fan, Jianqing; Ma, Cong; Zhong, Yiqiao: A selective overview of deep learning (2021)
  5. Frye, Charles G.; Simon, James; Wadia, Neha S.; Ligeralde, Andrew; Deweese, Michael R.; Bouchard, Kristofer E.: Critical point-finding methods reveal gradient-flat regions of deep network losses (2021)
  6. Haghighat, Ehsan; Raissi, Maziar; Moure, Adrian; Gomez, Hector; Juanes, Ruben: A physics-informed deep learning framework for inversion and surrogate modeling in solid mechanics (2021)
  7. Huang, Junhao; Sun, Weize; Huang, Lei: Joint structure and parameter optimization of multiobjective sparse neural network (2021)
  8. Kafka, Dominic; Wilke, Daniel N.: Resolving learning rates adaptively by locating stochastic non-negative associated gradient projection points using line searches (2021)
  9. Liu, Yang; Roosta, Fred: Convergence of Newton-MR under inexact Hessian information (2021)
  10. Ma, Chenxin; Jaggi, Martin; Curtis, Frank E.; Srebro, Nathan; Takáč, Martin: An accelerated communication-efficient primal-dual optimization framework for structured machine learning (2021)
  11. Sakai, Tomoya; Niu, Gang; Sugiyama, Masashi: Information-theoretic representation learning for positive-unlabeled classification (2021)
  12. Stordal, Andreas S.; Moraes, Rafael J.; Raanes, Patrick N.; Evensen, Geir: p-kernel Stein variational gradient descent for data assimilation and history matching (2021)
  13. Wills, Adrian G.; Schön, Thomas B.: Stochastic quasi-Newton with line-search regularisation (2021)
  14. Aggarwal, Charu C.: Linear algebra and optimization for machine learning. A textbook (2020)
  15. Akyildiz, Ömer Deniz; Crisan, Dan; Míguez, Joaquín: Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization (2020)
  16. Boffi, Nicholas M.; Slotine, Jean-Jacques E.: A continuous-time analysis of distributed stochastic gradient (2020)
  17. Burkhart, Michael C.; Brandman, David M.; Franco, Brian; Hochberg, Leigh R.; Harrison, Matthew T.: The discriminative Kalman filter for Bayesian filtering with nonlinear and Nongaussian observation models (2020)
  18. Daskalakis, Emmanouil; Herrmann, Felix J.; Kuske, Rachel: Accelerating sparse recovery by reducing chatter (2020)
  19. De, Subhayan; Maute, Kurt; Doostan, Alireza: Bi-fidelity stochastic gradient descent for structural optimization under uncertainty (2020)
  20. Duan, Jia; Zhou, Jiantao; Li, Yuanman: Privacy-preserving distributed deep learning based on secret sharing (2020)