Coding for Large-Scale Distributed Machine Learning
Xiao, Ming (KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering; KTH, School of Electrical Engineering and Computer Science (EECS), Centres, ACCESS Linnaeus Centre). ORCID iD: 0000-0002-5407-0835
Skoglund, Mikael (KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering). ORCID iD: 0000-0002-7926-5081
2022 (English). In: Entropy, E-ISSN 1099-4300, Vol. 24, no. 9, article id 1284. Article, review/survey (Refereed). Published.
Abstract [en]

This article aims to give a comprehensive and rigorous review of the principles and recent developments of coding for large-scale distributed machine learning (DML). With increasing data volumes and the pervasive deployment of sensors and computing machines, machine learning has become increasingly distributed, and both the number of computing nodes and the data volumes involved in learning tasks have grown significantly. For large-scale distributed learning systems, significant challenges have appeared in terms of delay, errors, efficiency, etc. To address these problems, various error-control and performance-boosting schemes have been proposed recently, such as the duplication of computing nodes. More recently, error-control coding has been investigated for DML to improve reliability and efficiency; its benefits include high efficiency and low complexity. Despite these benefits and the recent progress, however, there is still no comprehensive survey on this topic, especially for large-scale learning. This paper seeks to introduce the theories and algorithms of coding for DML. For primal-based DML schemes, we first discuss gradient coding with the optimal code distance, and then introduce random coding for gradient-based DML. For primal-dual-based DML, i.e., ADMM (alternating direction method of multipliers), we propose a separate coding method for the two steps of distributed optimization, and coding schemes for the different steps are discussed. Finally, a few potential directions for future work are given.
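
The abstract's central object, gradient coding, can be made concrete with a small sketch. The following is a minimal, illustrative example in the style of the classic construction by Tandon et al. (a scheme of the primal-based family this survey reviews), assuming n = 3 workers and tolerance to s = 1 straggler; the encoding matrix B and the toy gradients are assumptions chosen for illustration, not taken from the surveyed paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three data partitions; the "gradient" of each is just a random vector here.
g = rng.normal(size=(3, 4))           # g[k] = gradient on partition k

# Encoding matrix B: row i is the linear combination worker i transmits.
# Each worker is assigned s + 1 = 2 partitions.
B = np.array([[0.5, 1.0,  0.0],       # worker 0 sends g0/2 + g1
              [0.0, 1.0, -1.0],       # worker 1 sends g1 - g2
              [0.5, 0.0,  1.0]])      # worker 2 sends g0/2 + g2

coded = B @ g                         # what each worker would transmit

# Suppose worker 1 straggles; decode from the surviving workers {0, 2}.
alive = [0, 2]

# Find combining weights a with a @ B[alive] = all-ones row, so that
# a @ coded[alive] equals the sum of all partial gradients.
a, *_ = np.linalg.lstsq(B[alive].T, np.ones(3), rcond=None)

recovered = a @ coded[alive]
assert np.allclose(recovered, g.sum(axis=0))   # full gradient g0 + g1 + g2
```

The same decoding succeeds for any set of n - s = 2 surviving workers, which is exactly the straggler-tolerance property the abstract attributes to gradient coding; in general, tolerating s stragglers requires every partition to be replicated on at least s + 1 workers, which is the storage price behind the optimal code distance mentioned above.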

Place, publisher, year, edition, pages
MDPI AG, 2022. Vol. 24, no. 9, article id 1284.
Keywords [en]
error-control coding, gradient coding, random codes, ADMM
National Category
Other Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-319712
DOI: 10.3390/e24091284
ISI: 000858228600001
PubMedID: 36141170
Scopus ID: 2-s2.0-85138530737
OAI: oai:DiVA.org:kth-319712
DiVA, id: diva2:1704535
Note

QC 20221018

Available from: 2022-10-18. Created: 2022-10-18. Last updated: 2023-03-28. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
PubMed
Scopus

Authority records

Xiao, Ming; Skoglund, Mikael

