DiSCO

DiSCO: Distributed Optimization for Self-Concordant Empirical Loss. We propose a new distributed algorithm for empirical risk minimization in machine learning. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression, and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, where the n data points are sampled i.i.d. and the regularization parameter scales as (1/\sqrt{n}), we show that the proposed algorithm is communication efficient: the required number of communication rounds does not increase with the sample size n, and grows only slowly with the number of machines.
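
To illustrate the kind of iteration the abstract describes, the following is a minimal single-machine sketch of an inexact damped Newton method whose steps are computed by preconditioned conjugate gradient, assuming ridge-regularized logistic regression as the empirical loss. The function names (damped_newton_pcg, pcg), the diagonal preconditioner, and all parameter values are illustrative assumptions, not the paper's distributed reference implementation; in DiSCO the preconditioner is built from one machine's local Hessian and the CG products are computed across machines.

import numpy as np

def logistic_loss_grad_hess(w, X, y, lam):
    """Ridge-regularized logistic loss: value, gradient, and Hessian-vector product."""
    z = X @ w
    p = 1.0 / (1.0 + np.exp(-y * z))           # sigma(y_i x_i^T w)
    loss = np.mean(np.log1p(np.exp(-y * z))) + 0.5 * lam * (w @ w)
    grad = X.T @ (-(1.0 - p) * y) / len(y) + lam * w
    d = p * (1.0 - p) / len(y)                  # diagonal weights of the logistic Hessian
    hvp = lambda v: X.T @ (d * (X @ v)) + lam * v
    return loss, grad, hvp

def pcg(hvp, b, precond_solve, tol=1e-8, max_iter=200):
    """Preconditioned conjugate gradient for H v = b, with H given implicitly via hvp."""
    v = np.zeros_like(b)
    r = b - hvp(v)
    s = precond_solve(r)
    p = s.copy()
    rs = r @ s
    for _ in range(max_iter):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        v += alpha * p
        r -= alpha * Hp
        if np.linalg.norm(r) <= tol:
            break
        s = precond_solve(r)
        rs_new = r @ s
        p = s + (rs_new / rs) * p
        rs = rs_new
    return v

def damped_newton_pcg(X, y, lam, iters=20, mu=1e-2):
    """Inexact damped Newton: each outer step solves H_k v = grad_k approximately by PCG."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        loss, grad, hvp = logistic_loss_grad_hess(w, X, y, lam)
        # Diagonal preconditioner (Hessian diagonal bound + mu): a stand-in for the
        # paper's preconditioner formed from a single machine's local Hessian plus mu*I.
        diag = np.einsum('ij,ij->j', X, X) / (4.0 * n) + lam + mu
        v = pcg(hvp, grad, precond_solve=lambda r: r / diag)
        delta = np.sqrt(v @ hvp(v))             # Newton-decrement-like quantity
        w = w - v / (1.0 + delta)               # damped step, no line search needed
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 20))
    y = np.sign(X @ rng.standard_normal(20) + 0.1 * rng.standard_normal(500))
    lam = 1.0 / np.sqrt(500)                    # regularization scaling as 1/sqrt(n)
    w = damped_newton_pcg(X, y, lam)
    print("final loss:", logistic_loss_grad_hess(w, X, y, lam)[0])

The damping factor 1/(1 + delta) is the step size that self-concordance analysis uses to guarantee progress of each Newton step without a line search; the communication savings in the distributed setting come from the preconditioner keeping the number of CG iterations (and hence communication rounds) small.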


References in zbMATH (referenced in 12 articles)


  1. Zhang, Jiaqi; You, Keyou; Başar, Tamer: Distributed adaptive Newton methods with global superlinear convergence (2022)
  2. Ostrovskii, Dmitrii M.; Bach, Francis: Finite-sample analysis of (M)-estimators using self-concordance (2021)
  3. Lee, Ching-pei; Chang, Kai-Wei: Distributed block-diagonal approximation methods for regularized empirical risk minimization (2020)
  4. Richards, Dominic; Rebeschini, Patrick: Graph-dependent implicit regularisation for distributed stochastic subgradient descent (2020)
  5. Sun, Tianxiao; Necoara, Ion; Tran-Dinh, Quoc: Composite convex optimization with global and local inexact oracles (2020)
  6. Yuan, Xiao-Tong; Li, Ping: On convergence of distributed approximate Newton methods: globalization, sharper bounds and beyond (2020)
  7. Jordan, Michael I.; Lee, Jason D.; Yang, Yun: Communication-efficient distributed statistical inference (2019)
  8. Sun, Tianxiao; Tran-Dinh, Quoc: Generalized self-concordant functions: a recipe for Newton-type methods (2019)
  9. Xiao, Lin; Yu, Adams Wei; Lin, Qihang; Chen, Weizhu: DSCOVR: randomized primal-dual block coordinate algorithms for asynchronous distributed optimization (2019)
  10. Jain, Prateek; Kakade, Sham M.; Kidambi, Rahul; Netrapalli, Praneeth; Sidford, Aaron: Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification (2018)
  11. Lee, Jason D.; Lin, Qihang; Ma, Tengyu; Yang, Tianbao: Distributed stochastic variance reduced gradient methods by sampling extra data with replacement (2017)
  12. Zheng, Shun; Wang, Jialei; Xia, Fen; Xu, Wei; Zhang, Tong: A general distributed dual coordinate optimization framework for regularized loss minimization (2017)