Variance-Reduced Stochastic Quasi-Newton Methods for Decentralized Learning
2023 (English)In: IEEE Transactions on Signal Processing, ISSN 1053-587X, E-ISSN 1941-0476, Vol. 71, p. 311-326Article in journal (Refereed) Published
Abstract [en]
In this work, we investigate stochastic quasi-Newton methods for minimizing a finite sum of cost functions over a decentralized network. We first develop a general algorithmic framework, in which each node constructs a local, inexact quasi-Newton direction that asymptotically approaches the global, exact one at each time step. To do so, a local gradient approximation is constructed using dynamic average consensus to track the average of variance-reduced local stochastic gradients over the entire network, followed by a proper local Hessian inverse approximation. We show that under standard convexity and smoothness assumptions on the cost functions, the methods obtained from our framework converge linearly to the optimal solution if the local Hessian inverse approximations used have uniformly bounded positive eigenvalues. To construct the Hessian inverse approximations with the said boundedness property, we design two fully decentralized stochastic quasi-Newton methods-namely, the damped regularized limited-memory DFP (Davidon-Fletcher-Powell) and the damped limited-memory BFGS (Broyden-Fletcher-Goldfarb-Shanno)-which use a fixed moving window of past local gradient approximations and local decision variables to adaptively construct Hessian inverse approximations. A noteworthy feature of these methods is that they do not require extra sampling or communication. Numerical results show that the proposed decentralized stochastic quasi-Newton methods are much faster than the existing decentralized stochastic first-order algorithms.
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE) , 2023. Vol. 71, p. 311-326
Keywords [en]
Costs, Convergence, Signal processing algorithms, Approximation algorithms, Cost function, Newton method, Computational complexity, Decentralized optimization, stochastic quasi-Newton methods, variance reduction, damped limited-memory DFP, damped limited-memory BFGS
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-325198DOI: 10.1109/TSP.2023.3240652ISI: 000944455200002Scopus ID: 2-s2.0-85148468536OAI: oai:DiVA.org:kth-325198DiVA, id: diva2:1748283
Note
QC 20231122
2023-04-032023-04-032023-11-22Bibliographically approved