Publications (9 of 9)
Li, C., Xiao, M. & Skoglund, M. (2025). Coded Robust Aggregation for Distributed Learning under Byzantine Attacks. IEEE Transactions on Information Forensics and Security, 20, 11636-11651
Coded Robust Aggregation for Distributed Learning under Byzantine Attacks
2025 (English)In: IEEE Transactions on Information Forensics and Security, ISSN 1556-6013, E-ISSN 1556-6021, Vol. 20, p. 11636-11651Article in journal (Refereed) Published
Abstract [en]

In this paper, we investigate the problem of distributed learning (DL) in the presence of Byzantine attacks. For this problem, various robust bounded aggregation (RBA) rules have been proposed at the central server to mitigate the impact of Byzantine attacks. However, current DL methods apply RBA rules directly to the local gradients from the honest devices and the disruptive information from Byzantine devices, and the learning performance degrades significantly when the local gradients of different devices vary considerably from each other. To overcome this limitation, we propose a new DL method to cope with Byzantine attacks based on coded robust aggregation (CRA-DL). Before training begins, the training data are allocated to the devices redundantly. During training, in each iteration, the honest devices transmit coded gradients to the server computed from the allocated training data, and the server then aggregates the information received from both honest and Byzantine devices using RBA rules. In this way, the global gradient can be approximately recovered at the server to update the global model. Compared with current DL methods applying RBA rules, the improvement of CRA-DL stems from the fact that the coded gradients sent by the honest devices are closer to each other. This closeness enhances the robustness of the aggregation against Byzantine attacks, since Byzantine messages tend to differ significantly from those of the honest devices in this case. We theoretically analyze the convergence performance of CRA-DL. Finally, we present numerical results to verify the superiority of the proposed method over existing baselines, showing its enhanced learning performance under Byzantine attacks.
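The RBA step described in the abstract can be illustrated with one common robust rule, the coordinate-wise median (the paper treats general bounded aggregation rules; this particular rule and the toy numbers below are illustrative assumptions, not the paper's setup):

```python
from statistics import median

def coordinate_wise_median(vectors):
    """One common robust bounded aggregation (RBA) rule:
    take the median of the received values in each coordinate."""
    dim = len(vectors[0])
    return [median(v[d] for v in vectors) for d in range(dim)]

# Honest devices send coded gradients that are close to each other;
# a Byzantine device sends an arbitrary, far-away vector.
honest = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9]]
byzantine = [[100.0, -100.0]]
aggregated = coordinate_wise_median(honest + byzantine)
# The outlier is suppressed: each coordinate stays near the honest values.
```

The closer the honest messages are to each other, the less a single Byzantine outlier can move the per-coordinate median, which is the intuition behind CRA-DL's coded gradients.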

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Byzantine attacks, convergence analysis, distributed learning, gradient coding, robust aggregation
National Category
Signal Processing
Identifiers
urn:nbn:se:kth:diva-372561 (URN), 10.1109/TIFS.2025.3624620 (DOI), 001606641400002 (), 2-s2.0-105019656740 (Scopus ID)
Note

QC 20251111

Available from: 2025-11-11 Created: 2025-11-11 Last updated: 2025-11-11. Bibliographically approved
Li, C., Xiao, M. & Skoglund, M. (2025). Communication-Efficient Semi-Decentralized Federated Learning in the Presence of Stragglers. IEEE Transactions on Communications, 73(12), 13999-14013
Communication-Efficient Semi-Decentralized Federated Learning in the Presence of Stragglers
2025 (English)In: IEEE Transactions on Communications, ISSN 0090-6778, E-ISSN 1558-0857, Vol. 73, no 12, p. 13999-14013Article in journal (Refereed) Published
Abstract [en]

In this paper, we consider the problem of federated learning (FL) with devices that have intermittent connectivity to the central server. For this problem, the concept of semi-decentralized FL has been proposed in the literature. This paradigm allows non-straggler devices to relay the gradients computed by the stragglers to the server, and enables realization of gradient coding (GC) to mitigate the negative impact of the stragglers that fail to communicate directly to the central server. However, for GC in semi-decentralized FL, the communication overhead caused by information transmission among the devices is significant. To overcome this shortcoming, inspired by the existing communication-optimal exact consensus algorithm (CECA), we propose a new communication-efficient semi-decentralized FL method (COFFEE). In each round, the devices exchange information by taking a certain number of steps towards communication-optimal exact consensus, ensuring that each device obtains the average of the gradients computed by both its previous neighbors and itself. Afterwards, the non-stragglers transmit the local average result to the server for global aggregation to update the global model. We analytically characterize the convergence performance and the communication overhead of COFFEE. Building on this, to further enhance learning performance under a specific communication overhead, we propose an enhanced version of COFFEE with an adaptive aggregation rule at the central server, referred to as A-COFFEE, which adjusts to the straggler pattern of the devices over training rounds. Experiments are conducted to verify that the proposed methods outperform the baseline methods.
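The per-round consensus phase can be sketched with plain synchronous average consensus (a simplification: CECA itself is communication-optimal, and the device graph, mixing weights, and step count below are toy assumptions rather than the paper's construction):

```python
def consensus_step(values, neighbors):
    """One synchronous averaging step: each device replaces its value
    with the mean of its own value and its neighbors' values."""
    return [
        (values[i] + sum(values[j] for j in neighbors[i]))
        / (1 + len(neighbors[i]))
        for i in range(len(values))
    ]

# Toy example: three fully connected devices reach the exact average
# of their local gradients in a single step.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
local_gradients = [3.0, 6.0, 9.0]
averaged = consensus_step(local_gradients, neighbors)
```

After enough such steps every device holds the average of its neighborhood's gradients, so any non-straggler can forward that average to the server on behalf of the stragglers.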

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
communication efficiency, federated learning, intermittent connectivity, stragglers
National Category
Control Engineering; Communication Systems
Identifiers
urn:nbn:se:kth:diva-370075 (URN), 10.1109/TCOMM.2025.3605479 (DOI), 2-s2.0-105015207710 (Scopus ID)
Note

QC 20250922

Available from: 2025-09-22 Created: 2025-09-22 Last updated: 2025-12-30. Bibliographically approved
Li, C., Xiao, M. & Skoglund, M. (2025). Sign-Based Distributed Learning with Byzantine Resilience based on Audit Mechanism. IEEE Transactions on Information Forensics and Security, 20, 5774-5788
Sign-Based Distributed Learning with Byzantine Resilience based on Audit Mechanism
2025 (English)In: IEEE Transactions on Information Forensics and Security, ISSN 1556-6013, E-ISSN 1556-6021, Vol. 20, p. 5774-5788Article in journal (Refereed) Published
Abstract [en]

In this paper, we study the problem of distributed learning (DL) with devices transmitting sign information of the local gradients to the server under communication constraints, where the devices are susceptible to Byzantine attacks. For this problem, a sign-based gradient descent method with majority vote and stochastic 1-bit quantization (Sign-M-stochastic) has been proposed very recently. However, the Byzantine resilience of Sign-M-stochastic is inherently limited, because all Byzantine devices and honest devices participate equally in the training process. To overcome this drawback and enhance the resilience to Byzantine attacks, inspired by audit-based distributed detection systems, we propose a novel DL method with an audit mechanism (DL-AM). In each iteration, each device obtains the sign information of its local gradient via stochastic 1-bit quantization. All devices, partitioned into groups, send the sign information to the server through multiple paths, both directly and via other devices in the same group. This approach provides the server with additional information about the identities of the devices, which enables the server to form the global model update by aggregating the sign information of different devices with varying weights. We analyze the convergence performance of the proposed method from a theoretical perspective. Finally, numerical results demonstrate the superiority of DL-AM over the baseline methods.
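The two building blocks, stochastic 1-bit quantization at the devices and weighted aggregation of signs at the server, can be sketched as follows (the quantization probability is the standard unbiased form, and the toy weights are illustrative assumptions; DL-AM's actual weights come from its audit mechanism):

```python
import random

def stochastic_sign(g, bound):
    """Stochastic 1-bit quantization of a gradient entry g with
    |g| <= bound: returns +1 with probability (1 + g / bound) / 2,
    so the expected output equals g / bound (unbiased up to scaling)."""
    return 1 if random.random() < (1.0 + g / bound) / 2.0 else -1

def weighted_sign_aggregate(signs, weights):
    """Server-side update direction: weighted vote over received signs."""
    return 1 if sum(w * s for w, s in zip(weights, signs)) >= 0 else -1

# A device deemed suspicious by the audit gets a small weight,
# so its flipped sign is outvoted by the two trusted devices.
direction = weighted_sign_aggregate([+1, +1, -1], [1.0, 1.0, 0.2])
```

With equal weights this reduces to the plain majority vote of Sign-M-stochastic; the varying weights are what the audit information makes possible.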

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Audit mechanism, Byzantine resilience, convergence analysis, distributed learning, sign information
National Category
Signal Processing
Identifiers
urn:nbn:se:kth:diva-364453 (URN), 10.1109/TIFS.2025.3575582 (DOI), 001511714600003 (), 2-s2.0-105007298930 (Scopus ID)
Note

QC 20260128

Available from: 2025-06-12 Created: 2025-06-12 Last updated: 2026-01-28. Bibliographically approved
Li, C., Xiao, M. & Skoglund, M. (2024). A Communication-Efficient Semi-Decentralized Approach for Federated Learning with Stragglers. In: 2024 IEEE Information Theory Workshop, ITW 2024: . Paper presented at 2024 IEEE Information Theory Workshop, ITW 2024, Shenzhen, China, November 24-28, 2024 (pp. 229-234). Institute of Electrical and Electronics Engineers (IEEE)
A Communication-Efficient Semi-Decentralized Approach for Federated Learning with Stragglers
2024 (English)In: 2024 IEEE Information Theory Workshop, ITW 2024, Institute of Electrical and Electronics Engineers (IEEE) , 2024, p. 229-234Conference paper, Published paper (Refereed)
Abstract [en]

We study the problem of federated learning (FL) in the presence of stragglers, i.e., devices that are intermittently connected to the central server. Although under the newly developed semi-decentralized federated learning (SFL) framework, gradient coding (GC) can be applied to evade the stragglers by letting them relay their locally computed gradients to the central server via non-stragglers, the communication burden of GC in SFL is very heavy. To overcome this drawback, motivated by the communication-optimal exact consensus algorithm (CECA) proposed in the literature, we propose a new communication-efficient semi-decentralized method (COFFEE) in SFL. In each round of COFFEE, the devices take a certain number of steps towards consensus in a decentralized manner with high communication efficiency, and each of them acquires the average of its own gradient and the gradients of its previous neighbors. After that, the non-straggler devices send the obtained average results to the server, which aggregates the received vectors to yield the global model update. The learning performance of the proposed method is analyzed through convergence analysis. Finally, we run simulations to show the superiority of COFFEE over the baseline method, i.e., GC in SFL.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
communication efficiency, semi-decentralized federated learning, stragglers
National Category
Control Engineering; Communication Systems
Identifiers
urn:nbn:se:kth:diva-359870 (URN), 10.1109/ITW61385.2024.10807022 (DOI), 001433908800039 (), 2-s2.0-85216535438 (Scopus ID)
Conference
2024 IEEE Information Theory Workshop, ITW 2024, Shenzhen, China, November 24-28, 2024
Note

Part of ISBN 9798350348934

QC 20250213

Available from: 2025-02-12 Created: 2025-02-12 Last updated: 2025-04-30. Bibliographically approved
Weng, S., Li, C., Xiao, M. & Skoglund, M. (2024). Cooperative Gradient Coding for Semi-Decentralized Federated Learning. In: GLOBECOM 2024 - 2024 IEEE Global Communications Conference: . Paper presented at 2024 IEEE Global Communications Conference, GLOBECOM 2024, Cape Town, South Africa, Dec 8 2024 - Dec 12 2024 (pp. 199-204). Institute of Electrical and Electronics Engineers (IEEE)
Cooperative Gradient Coding for Semi-Decentralized Federated Learning
2024 (English)In: GLOBECOM 2024 - 2024 IEEE Global Communications Conference, Institute of Electrical and Electronics Engineers (IEEE) , 2024, p. 199-204Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we investigate federated learning (FL) over wireless networks in the presence of communication stragglers, whose effects are known to degrade FL performance. In this setting, power-constrained clients collaboratively train a global model by iteratively optimizing a local objective function with their local datasets and transmitting local model updates to the central parameter server (PS) through fading channels. To tackle communication stragglers without dataset sharing or prior information about the network at the PS, we propose cooperative gradient coding (CoGC) for semi-decentralized FL to enable exact global model recovery at the PS. Furthermore, we conduct a thorough theoretical analysis of the proposed approach. Namely, an outage analysis of the proposed approach is provided, followed by a convergence analysis based on the failure probability of global model recovery at the PS. Finally, simulation results reveal the superiority of the proposed approach in the presence of stragglers under imbalanced data distribution.
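The flavor of an outage-driven recovery analysis can be sketched for a repetition-style assignment (a hypothetical simplification, not the paper's model: independent straggling with a fixed probability, each data partition replicated on `replication` devices, and exact recovery failing only when every holder of some partition is in outage):

```python
def recovery_failure_prob(p_straggle, replication, num_partitions):
    """Probability that exact global-model recovery fails, assuming
    independent straggling: a partition is lost only when all of its
    `replication` holders straggle in the same round."""
    p_partition_lost = p_straggle ** replication
    return 1.0 - (1.0 - p_partition_lost) ** num_partitions

# Doubling the replication sharply reduces the failure probability,
# at the cost of more redundant computation per device.
p_r2 = recovery_failure_prob(0.5, 2, 4)
p_r4 = recovery_failure_prob(0.5, 4, 4)
```

A convergence bound can then be stated in terms of this failure probability, which is the structure of the analysis the abstract describes.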

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
communication stragglers, convergence, Federated learning, gradient coding, outages, semi-decentralized network
National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-361981 (URN), 10.1109/GLOBECOM52923.2024.10901785 (DOI), 001511158700034 (), 2-s2.0-105000827333 (Scopus ID)
Conference
2024 IEEE Global Communications Conference, GLOBECOM 2024, Cape Town, South Africa, Dec 8 2024 - Dec 12 2024
Note

Part of ISBN 9798350351255

QC 20250409

Available from: 2025-04-03 Created: 2025-04-03 Last updated: 2025-12-08. Bibliographically approved
Li, C. & Skoglund, M. (2024). Decentralized Learning Based on Gradient Coding With Compressed Communication. IEEE Transactions on Signal Processing, 72, 4713-4729
Decentralized Learning Based on Gradient Coding With Compressed Communication
2024 (English)In: IEEE Transactions on Signal Processing, ISSN 1053-587X, E-ISSN 1941-0476, Vol. 72, p. 4713-4729Article in journal (Refereed) Published
Abstract [en]

This paper considers the problem of decentralized learning (DEL) with stragglers under a communication bottleneck. In the literature, various gradient coding techniques have been proposed for distributed learning with stragglers by letting the devices transmit encoded gradients based on redundant training data. However, those techniques cannot be directly applied to fully decentralized scenarios as considered in this paper due to the lack of a global model in DEL. To overcome this shortcoming, we first propose a new gossip-based DEL method with gradient coding (GOCO). In GOCO, to mitigate the negative impact of stragglers, the devices update the parameter vectors with encoded gradients based on stochastic gradient coding before averaging in a gossip-based manner. To further reduce the communication overhead associated with GOCO, we propose an enhanced version of GOCO, namely GOCO with compressed communication (2-GOCO), where the devices transmit compressed messages instead of the raw parameter vectors. The convergence of the proposed methods is analyzed for strongly convex loss functions. Simulation results demonstrate that the proposed methods outperform the baseline methods, attaining better learning performance under the same communication overhead.
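The compression step in 2-GOCO can be illustrated with top-k sparsification, one standard contractive compressor (the abstract does not name the compressor class, so this specific choice is an assumption):

```python
def top_k(vector, k):
    """Top-k sparsification: keep the k largest-magnitude entries
    and zero out the rest, so only k (index, value) pairs need to
    be transmitted instead of the full parameter vector."""
    kept = set(sorted(range(len(vector)),
                      key=lambda i: abs(vector[i]),
                      reverse=True)[:k])
    return [v if i in kept else 0.0 for i, v in enumerate(vector)]

# Only the two dominant entries of the parameter vector survive.
compressed = top_k([0.1, -3.0, 2.0, 0.5], k=2)
```

For a d-dimensional model this cuts each gossip message from d values to k pairs, which is where the communication saving over plain GOCO comes from.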

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Communication bottleneck, compression, decentralized learning, gradient coding, stragglers
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-356468 (URN), 10.1109/TSP.2024.3467262 (DOI), 001342545100006 (), 2-s2.0-85206088001 (Scopus ID)
Note

QC 20241119

Available from: 2024-11-19 Created: 2024-11-19 Last updated: 2024-11-19. Bibliographically approved
Li, C. & Skoglund, M. (2024). Distributed Learning Based on 1-Bit Gradient Coding in the Presence of Stragglers. IEEE Transactions on Communications, 72(8), 4903-4916
Distributed Learning Based on 1-Bit Gradient Coding in the Presence of Stragglers
2024 (English)In: IEEE Transactions on Communications, ISSN 0090-6778, E-ISSN 1558-0857, Vol. 72, no 8, p. 4903-4916Article in journal (Refereed) Published
Abstract [en]

This paper considers the problem of distributed learning (DL) in the presence of stragglers. For this problem, DL methods based on gradient coding have been widely investigated, which redundantly distribute the training data to the workers to guarantee convergence when some workers are stragglers. However, these methods require the workers to transmit real-valued vectors during the process of learning, which induces a very high communication burden. To overcome this drawback, we propose a novel DL method based on 1-bit gradient coding (1-bit GC-DL), where 1-bit data encoded from the locally computed gradients are transmitted by the workers to reduce the communication overhead. We theoretically provide the convergence guarantees of the proposed method for both convex and non-convex loss functions. It is shown empirically that 1-bit GC-DL outperforms the baseline methods, attaining better learning performance under the same communication overhead.
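The uplink saving is straightforward to quantify; assuming each real-valued gradient entry would otherwise be sent as a 32-bit float (an assumption, since the abstract does not fix the precision), 1-bit encoding shrinks the per-round traffic by a factor of 32:

```python
def uplink_bits_per_round(model_dim, num_workers, bits_per_entry):
    """Total uplink traffic per round: every worker sends one
    value per model coordinate to the server."""
    return model_dim * num_workers * bits_per_entry

full_precision = uplink_bits_per_round(10**6, 20, 32)  # real-valued gradients
one_bit = uplink_bits_per_round(10**6, 20, 1)          # 1-bit encoded gradients
savings = full_precision // one_bit                    # factor of 32
```

The empirical comparison in the paper is made at equal communication overhead, i.e., the baselines get the same bit budget rather than the same number of rounds.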

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Vectors, Quantization (signal), Convergence, Training data, Training, Encoding, Costs, Distributed learning, 1-bit quantization, stragglers, communication overhead, convergence analysis
National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-354084 (URN), 10.1109/TCOMM.2024.3377715 (DOI), 001294594400036 (), 2-s2.0-85188541783 (Scopus ID)
Note

QC 20241004

Available from: 2024-10-04 Created: 2024-10-04 Last updated: 2024-10-04. Bibliographically approved
Li, C. & Skoglund, M. (2024). Gradient Coding in Decentralized Learning for Evading Stragglers. In: 32nd european signal processing conference, EUSIPCO 2024: . Paper presented at 32nd European Signal Processing Conference (EUSIPCO), Lyon, France, August 26-30, 2024 (pp. 1821-1825). Institute of Electrical and Electronics Engineers (IEEE)
Gradient Coding in Decentralized Learning for Evading Stragglers
2024 (English)In: 32nd european signal processing conference, EUSIPCO 2024, Institute of Electrical and Electronics Engineers (IEEE) , 2024, p. 1821-1825Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we consider a decentralized learning problem in the presence of stragglers. Although gradient coding techniques have been developed for distributed learning to evade stragglers, where the devices send encoded gradients based on redundant training data, it is difficult to apply those techniques directly to decentralized learning scenarios. To deal with this problem, we propose a new gossip-based decentralized learning method with gradient coding (GOCO). In the proposed method, to avoid the negative impact of stragglers, the parameter vectors are updated locally using encoded gradients based on the framework of stochastic gradient coding and then averaged in a gossip-based manner. We analyze the convergence performance of GOCO for strongly convex loss functions. We also provide simulation results to demonstrate the superiority of the proposed method in terms of learning performance compared with the baseline methods.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
European Signal Processing Conference, ISSN 2076-1465
Keywords
decentralized learning, gradient coding, stragglers
National Category
Telecommunications; Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-358609 (URN), 10.23919/EUSIPCO63174.2024.10715396 (DOI), 001349787000365 (), 2-s2.0-85208082871 (Scopus ID)
Conference
32nd European Signal Processing Conference (EUSIPCO), Lyon, France, August 26-30, 2024
Note

Part of ISBN 978-9-4645-9361-7, 979-8-3315-1977-3

QC 20250120

Available from: 2025-01-20 Created: 2025-01-20 Last updated: 2025-01-20. Bibliographically approved
Wang, Z., Wang, X., Li, G. & Li, C. (2024). Robust cross-modal remote sensing image retrieval via maximal correlation augmentation. IEEE Transactions on Geoscience and Remote Sensing, 62, Article ID 4705517.
Robust cross-modal remote sensing image retrieval via maximal correlation augmentation
2024 (English)In: IEEE Transactions on Geoscience and Remote Sensing, ISSN 0196-2892, E-ISSN 1558-0644, Vol. 62, article id 4705517Article in journal (Refereed) Published
Abstract [en]

Most of the existing studies regarding cross-modal content-based remote sensing image retrieval (CM-CBRSIR) focus on reducing/enlarging the Euclidean distances of cross-modal (CM) data with the same/different content in a common feature space. The advantage of using the Euclidean distance lies in its simplicity. However, the Euclidean distances of CM data features are sensitive to outlier data and may lead to non-robust retrieval performance, particularly in the case of noisy images with low quality. To address this issue, we propose a robust Hirschfeld-Gebelein-Rényi maximal correlation (HGRMC) augmented algorithm for CM-CBRSIR in this work, named HGRMC-augmented CM-CBRSIR (HAC). In HAC, not only the projected features of CM data in Euclidean distance space but also the maximal correlation information of HGRMC are learned during the training phase of the retrieval model, where HGRMC is additionally used to capture the statistical dependency between CM data to enhance the retrieval performance with strongly noisy input data. In the retrieval phase, we also develop a fusion scheme based on the Dempster-Shafer (DS) evidence theory to combine the strengths of the Euclidean distance and HGRMC correlation criteria. Extensive experimental results demonstrate that our proposed HAC algorithm provides better and more robust retrieval performance in comparison with existing state-of-the-art CM-CBRSIR methods.
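The Euclidean-distance baseline that HAC augments can be sketched as a nearest-neighbor ranking in the common feature space (the feature vectors below are toy values, not learned embeddings):

```python
def rank_by_euclidean(query, gallery):
    """Rank gallery feature vectors by Euclidean distance to the
    query feature (smaller distance = better cross-modal match)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return sorted(range(len(gallery)), key=lambda i: dist(query, gallery[i]))

# Index 1 is nearest to the query, so it is retrieved first.
ranking = rank_by_euclidean([0.0, 0.0], [[5.0, 5.0], [1.0, 0.0], [0.0, 2.0]])
```

Because a single noisy outlier coordinate can dominate such a distance, HAC fuses this criterion with an HGRMC-based one via DS evidence theory instead of relying on it alone.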

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Cross-modal image retrieval, Dempster-Shafer (DS) evidence theory, Hirschfeld-Gebelein-Rényi maximal correlation (HGRMC), remote sensing data management
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-354207 (URN), 10.1109/TGRS.2024.3406606 (DOI), 001294607700010 (), 2-s2.0-85194845008 (Scopus ID)
Note

QC 20241002

Available from: 2024-10-02 Created: 2024-10-02 Last updated: 2024-10-02. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-1649-1943