A Primal-Dual SGD Algorithm for Distributed Nonconvex Optimization
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Decision and Control Systems (Automatic Control); Digital Futures, S-10044 Stockholm, Sweden. ORCID iD: 0000-0003-4299-0471
University of North Texas, Department of Electrical Engineering, Denton, TX 76203, USA.
Northeastern University, State Key Laboratory of Synthetical Automation for Process Industries, Shenyang 110819, People's Republic of China.
Northeastern University, State Key Laboratory of Synthetical Automation for Process Industries, Shenyang 110819, People's Republic of China.
2022 (English). In: IEEE/CAA Journal of Automatica Sinica, ISSN 2329-9266, Vol. 9, no. 5, p. 812-833. Article in journal (Refereed). Published.
Abstract [en]

We consider the distributed nonconvex optimization problem of minimizing a global cost function, formed as the sum of n local cost functions, using only local information exchange. This problem is an important component of many machine learning techniques with data parallelism, such as deep learning and federated learning. We propose a distributed primal-dual stochastic gradient descent (SGD) algorithm that is suitable for arbitrarily connected communication networks and any smooth (possibly nonconvex) cost functions. We show that the proposed algorithm achieves the linear speedup convergence rate O(1/√(nT)) for general nonconvex cost functions, and the linear speedup convergence rate O(1/(nT)) when the global cost function satisfies the Polyak-Łojasiewicz (P-Ł) condition, where T is the total number of iterations. We also show that the output of the proposed algorithm with constant parameters linearly converges to a neighborhood of a global optimum. Numerical experiments demonstrate the efficiency of our algorithm in comparison with the baseline centralized SGD and recently proposed distributed SGD algorithms.
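
The algorithm's exact update rules are given in the paper, not in this record. As a loose illustration of the primal-dual SGD idea the abstract describes, the Python sketch below has each agent hold a primal iterate and a dual iterate, mix primal iterates with its neighbors through a graph Laplacian, and take a stochastic gradient step on its local cost while the dual variable accumulates the consensus error. The topology, local costs, noise model, and the step parameters eta, alpha, beta are all assumptions made for the sketch, not values from the paper.

# A hypothetical sketch of distributed primal-dual SGD, NOT the paper's
# exact algorithm: the gains, dual update, and noise model are assumed.
import numpy as np

rng = np.random.default_rng(0)

n, d, T = 4, 3, 2000               # agents, dimension, iterations
eta, alpha, beta = 0.05, 1.0, 0.5  # step size and primal-dual gains (assumed)

# Ring-graph Laplacian L = D - A; the paper allows arbitrary connected graphs.
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Smooth local costs f_i(x) = 0.5 * ||x - c_i||^2, one target c_i per agent.
C = rng.normal(size=(n, d))

def stoch_grad(X):
    """Stochastic gradients of the local costs; additive noise as a stand-in."""
    return (X - C) + 0.1 * rng.normal(size=X.shape)

X = rng.normal(size=(n, d))  # primal variables, one row per agent
V = np.zeros((n, d))         # dual variables enforcing consensus

for _ in range(T):
    LX = L @ X               # one round of communication with neighbors
    X = X - eta * (alpha * LX + beta * V + stoch_grad(X))
    V = V + eta * beta * LX  # dual ascent on the consensus constraint

print("consensus error:", np.linalg.norm(X - X.mean(axis=0)))
print("distance to optimum:", np.linalg.norm(X.mean(axis=0) - C.mean(axis=0)))

Since V starts at zero and is only ever updated along the range of L, the columns of V sum to zero, so any fixed point has the agents in consensus at the minimizer of the average cost (here, the mean of the c_i). With constant parameters and persistent gradient noise the iterates settle in a neighborhood of that optimum rather than at it, consistent with the constant-parameter result quoted in the abstract.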

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022. Vol. 9, no. 5, p. 812-833.
Keywords [en]
Distributed nonconvex optimization, linear speedup, Polyak-Łojasiewicz (P-Ł) condition, primal-dual algorithm, stochastic gradient descent
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-313517
DOI: 10.1109/JAS.2022.105554
ISI: 000794202700009
Scopus ID: 2-s2.0-85129516654
OAI: oai:DiVA.org:kth-313517
DiVA, id: diva2:1665268
Note

QC 20220607

Available from: 2022-06-07. Created: 2022-06-07. Last updated: 2022-06-25. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Yi, Xinlei; Johansson, Karl Henrik
