Cooperative Gradient Coding
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. ORCID iD: 0000-0003-0930-7001
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. ORCID iD: 0000-0001-9096-8792
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. ORCID iD: 0000-0002-5407-0835
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. ORCID iD: 0000-0002-7926-5081
2025 (English). In: IEEE Transactions on Communications, ISSN 0090-6778, E-ISSN 1558-0857, Vol. 73, no 12, p. 13087-13102. Article in journal (Refereed). Published
Abstract [en]

This work studies gradient coding (GC) in the context of distributed training problems with unreliable communication. We propose cooperative GC (CoGC), a novel gradient-sharing-based GC framework that leverages cooperative communication among clients. This approach eliminates the need for dataset replication, making it communication- and computation-efficient and suitable for federated learning (FL). By employing the standard GC decoding mechanism, CoGC yields strictly binary outcomes: the global model is either recovered exactly or the recovery is meaningless, with no intermediate outcomes. This characteristic ensures the optimality of the training and demonstrates strong resilience to client-to-server communication failures. However, due to the limited flexibility of the recovery outcomes, the decoding mechanism may also result in communication inefficiency and hinder convergence, especially when communication channels among clients are in poor condition. To overcome this limitation and further exploit the potential of GC matrices, we propose a complementary decoding mechanism, termed GC⁺, which leverages information that would otherwise be discarded during GC decoding failures. This approach significantly improves system reliability against unreliable communication, as the full recovery of the global model dominates in GC⁺. To conclude, this work establishes solid theoretical frameworks for both CoGC and GC⁺. We assess the system reliability by outage analyses and convergence analyses for each decoding mechanism, along with a rigorous investigation of how outages affect the structure and performance of GC matrices. Finally, the effectiveness of CoGC and GC⁺ is validated through extensive simulations.
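
The all-or-nothing behaviour of standard GC decoding mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual formulation in which the server looks for a combination of the received coded messages whose coefficients, applied to the rows of the GC matrix, give the all-ones vector. The 3-client matrix `B`, the function name `gc_decode`, and the random gradients are illustrative assumptions, not taken from the paper, and the client-to-client sharing step of CoGC and the GC⁺ decoder are not modelled.

```python
import numpy as np

# Illustrative 3-client GC matrix that tolerates one missing coded message.
# This is an assumed textbook-style example, not a matrix from the paper.
B = np.array([[0.5, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [0.5, 0.0, 1.0]])

def gc_decode(coded, received, B):
    """Return the exact gradient sum, or None if standard decoding fails."""
    if len(received) == 0:
        return None
    rows = B[received]                      # rows of B that reached the server
    ones = np.ones(B.shape[1])
    # Least-squares attempt to write the all-ones vector as a combination of rows.
    a, *_ = np.linalg.lstsq(rows.T, ones, rcond=None)
    if not np.allclose(rows.T @ a, ones):
        return None                         # all-ones vector is not in the row span
    return a @ coded[received]

rng = np.random.default_rng(0)
grads = rng.normal(size=(3, 4))             # hypothetical local gradients
coded = B @ grads                           # client i sends sum_j B[i, j] * grads[j]

full = gc_decode(coded, [0, 2], B)          # two messages arrive
fail = gc_decode(coded, [1], B)             # only one message arrives
```

With two received messages `full` equals `grads.sum(axis=0)`, while with a single message `fail` is `None`: the sketch has no partial-recovery path, which is the binary outcome the abstract describes.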

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025. Vol. 73, no 12, p. 13087-13102
Keywords [en]
Complementary decoding mechanism, Convergence, Cooperative gradient coding, Federated learning, Secure Aggregation, Straggler mitigation, Unreliable communication
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-371622
DOI: 10.1109/TCOMM.2025.3612589
ISI: 001649704400032
Scopus ID: 2-s2.0-105017454960
OAI: oai:DiVA.org:kth-371622
DiVA, id: diva2:2007048
Note

QC 20260123

Available from: 2025-10-17. Created: 2025-10-17. Last updated: 2026-01-23. Bibliographically approved

Authority records

Weng, Shudi; Ren, Chao; Xiao, Ming; Skoglund, Mikael
