Decentralized Learning of Randomization-based Neural Networks
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. ORCID iD: 0000-0003-4406-536x
2021 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Machine learning and artificial intelligence have been widely explored and are developing rapidly to meet growing needs in almost every aspect of human activity. In the big data era, data siloed at separate locations has become a major challenge for machine learning. Constrained by scattered data locations and privacy regulations on information sharing, recent studies aim to develop collaborative machine learning techniques in which local models approximate the centralized performance without sharing real data. Privacy preservation is as important as model performance and model complexity. This thesis investigates a class of learning models with low computational complexity: randomization-based feed-forward neural networks (RFNs). As a class of artificial neural networks (ANNs), RFNs offer a favorable balance between low computational complexity and satisfactory performance, especially for non-image data. Motivated by the advantages of RFNs and the need for distributed learning solutions, we study the potential and applicability of RFNs together with distributed optimization methods, leading to the design of decentralized variants of RFNs that deliver the desired results.

First, we provide decentralized learning algorithms based on RFN architectures for undirected network topologies using synchronous communication. We investigate decentralized learning of five RFNs that provide centralized equivalent performance, as if all training data samples were available at a single node. Two of the five neural networks are shallow, and the others are deep. Experiments with nine benchmark datasets show that the five neural networks provide good performance while requiring low computational and communication complexity for decentralized learning.

We are then motivated to design an asynchronous decentralized learning method that achieves centralized equivalent performance with low computational complexity and communication overhead. We propose an asynchronous decentralized learning algorithm using ARock-based ADMM to realize decentralized variants of a variety of RFNs. The proposed algorithm enables single-node activation and one-sided communication in an undirected communication network characterized by a doubly-stochastic network policy matrix. In addition, the proposed algorithm obtains the centralized solution with reduced computational cost and improved communication efficiency.

Finally, we consider the problem of training a neural network in a decentralized scenario with highly sparse connections. The issue is addressed by adapting a recently proposed incremental learning approach called 'learning without forgetting.' While an incremental learning approach assumes that data arrive in a sequence, the nodes of the decentralized scenario cannot share data among themselves, and there is no master node. Nodes can only communicate information about model parameters with their neighbors, and this communication of model parameters is the key to adapting the 'learning without forgetting' approach to the decentralized scenario.
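
A common building block behind the synchronous algorithms summarized above is decentralized average consensus (DAC) over a doubly-stochastic network policy matrix. The following minimal NumPy sketch is not code from the thesis; the ring topology, node count, and iteration count are illustrative assumptions. It only shows how repeated mixing with such a matrix drives every node's local quantity to the network-wide average without any node seeing the others' data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ring topology over 5 nodes; W is doubly stochastic (rows and columns sum to 1).
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

# Each node holds a local quantity (e.g. a locally computed parameter estimate).
x = rng.normal(size=(n, 3))
global_avg = x.mean(axis=0)

# Synchronous consensus iterations: every node mixes with its neighbours each round.
for _ in range(200):
    x = W @ x

print(np.max(np.abs(x - global_avg)))  # close to 0: all nodes agree on the average
```

In the thesis, the quantities averaged in this way are model parameters and intermediate optimization variables of the convex subproblems, never the raw training data.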

Place, publisher, year, edition, pages
Sweden: KTH Royal Institute of Technology, 2021.
Series
TRITA-EECS-AVL ; 2021:40
National Category
Communication Systems; Telecommunications
Research subject
Electrical Engineering
Identifiers
URN: urn:nbn:se:kth:diva-295433
ISBN: 978-91-7873-904-2 (print)
OAI: oai:DiVA.org:kth-295433
DiVA, id: diva2:1556176
Public defence
2021-06-11, https://kth-se.zoom.us/j/64005034683, U1, Brinellvägen 28A, Undervisningshuset, floor 6, KTH Campus, Stockholm, 13:00 (English)
Note

QC 20210520

Available from: 2021-05-20. Created: 2021-05-20. Last updated: 2022-07-08. Bibliographically approved
List of papers
1. DISTRIBUTED LARGE NEURAL NETWORK WITH CENTRALIZED EQUIVALENCE
2018 (English) In: 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), IEEE, 2018, p. 2976-2980. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, we develop a distributed algorithm for learning a large neural network that is deep and wide. We consider a scenario where the training dataset is not available at a single processing node but is distributed among several nodes. We show that a recently proposed large neural network architecture called the progressive learning network (PLN) can be trained in a distributed setup with centralized equivalence, meaning that we obtain the same result as if the data were available at a single node. Using the distributed convex optimization method known as the alternating direction method of multipliers (ADMM), we train the PLN in the distributed setup.
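
To make the ADMM-with-centralized-equivalence idea concrete, here is a hedged sketch (not the paper's implementation; node count, penalties, and iteration count are illustrative assumptions) of global-consensus ADMM for a regularized least-squares problem of the kind that appears when training the output weights of such architectures. Each node solves a small local problem on its private data, all nodes agree on a shared variable z, and the result matches the centralized ridge solution on the pooled data.

```python
import numpy as np

rng = np.random.default_rng(1)

K, d, m, n_k = 4, 8, 2, 50          # nodes, features, outputs, samples per node
lam, rho = 0.1, 5.0                  # ridge penalty and ADMM penalty (illustrative values)

H = [rng.normal(size=(n_k, d)) for _ in range(K)]   # local feature matrices
Y = [rng.normal(size=(n_k, m)) for _ in range(K)]   # local targets

# Centralized ridge solution on the pooled data, used as the reference.
H_all, Y_all = np.vstack(H), np.vstack(Y)
W_central = np.linalg.solve(H_all.T @ H_all + lam * np.eye(d), H_all.T @ Y_all)

# Global-consensus ADMM: each node k keeps (W_k, U_k); z is the shared variable.
W = [np.zeros((d, m)) for _ in range(K)]
U = [np.zeros((d, m)) for _ in range(K)]
z = np.zeros((d, m))
for _ in range(1000):
    for k in range(K):               # local least-squares step at every node
        A = H[k].T @ H[k] + rho * np.eye(d)
        W[k] = np.linalg.solve(A, H[k].T @ Y[k] + rho * (z - U[k]))
    # The z-update needs only the network average of (W_k + U_k); in a fully
    # decentralized setting that average would itself be obtained by consensus,
    # here we compute it directly for brevity.
    z = rho * sum(Wk + Uk for Wk, Uk in zip(W, U)) / (lam + rho * K)
    for k in range(K):               # dual updates
        U[k] += W[k] - z

print("max |z - W_central| =", np.max(np.abs(z - W_central)))  # small residual
```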

Place, publisher, year, edition, pages
IEEE, 2018
Keywords
Distributed learning, neural networks, data parallelism, convex optimization
National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-237152 (URN)
10.1109/ICASSP.2018.8462179 (DOI)
000446384603029 (ISI)
2-s2.0-85054237028 (Scopus ID)
Conference
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
Note

QC 20181025

Available from: 2018-10-25. Created: 2018-10-25. Last updated: 2022-06-26. Bibliographically approved
2. A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning
2020 (English) In: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020. Conference paper, Published paper (Refereed)
Abstract [en]

We design a low-complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers). We assume that the communication network between the workers is synchronous and can be modeled as a doubly-stochastic mixing matrix without any master node. In our setup, the training data are distributed among the workers but are not shared in the training process due to privacy and security concerns. Using the alternating direction method of multipliers (ADMM) along with a layer-wise convex optimization approach, we propose a decentralized learning algorithm that enjoys low computational complexity and low communication cost among the workers. We show that it is possible to achieve learning performance equivalent to that obtained if the data were available in a single place. Finally, we experimentally illustrate the time complexity and convergence behavior of the algorithm.
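
The layer-wise convex optimization mentioned above is what makes such decentralized training tractable: in randomization-based networks the hidden weights are drawn at random and kept fixed, so the trainable weights of a layer solve a convex (ridge regression) problem. Below is a toy single-hidden-layer example in NumPy, an ELM/RVFL-style sketch for illustration only, not the architecture trained in the paper; data sizes and the regularization constant are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 200 samples, 10 features, 3-class one-hot targets (all illustrative).
n, d, h, c = 200, 10, 64, 3
X = rng.normal(size=(n, d))
T = np.eye(c)[rng.integers(0, c, size=n)]

# Randomization-based layer: the hidden weights are drawn once and never trained.
W_in = rng.normal(size=(d, h))
b = rng.normal(size=h)
H = np.maximum(X @ W_in + b, 0.0)          # ReLU feature map

# The only trained parameters solve a convex ridge regression problem, which is
# what makes an ADMM-based decentralized solution with centralized equivalence possible.
lam = 1e-2
W_out = np.linalg.solve(H.T @ H + lam * np.eye(h), H.T @ T)

pred = (H @ W_out).argmax(axis=1)
print("training accuracy:", (pred == T.argmax(axis=1)).mean())
```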

Place, publisher, year, edition, pages
IEEE, 2020
Series
IEEE International Joint Conference on Neural Networks (IJCNN), ISSN 2161-4393
Keywords
decentralized learning, neural network, ADMM, communication network
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-292968 (URN)
10.1109/IJCNN48605.2020.9206592 (DOI)
000626021400002 (ISI)
2-s2.0-85093843749 (Scopus ID)
Conference
International Joint Conference on Neural Networks (IJCNN) held as part of the IEEE World Congress on Computational Intelligence (IEEE WCCI), JUL 19-24, 2020, ELECTR NETWORK
Note

QC 20210419

Available from: 2021-04-19. Created: 2021-04-19. Last updated: 2023-04-05. Bibliographically approved
3. Decentralized Learning of Randomization-based Neural Networks with Centralized Equivalence
(English) Manuscript (preprint) (Other academic)
Abstract [en]

We consider a decentralized learning problem where training data samples are distributed over agents (processing nodes) of an underlying communication network topology without any central (master) node. Due to information privacy and security issues in a decentralized setup, nodes are not allowed to share their training data with each other; only the parameters of the neural network may be shared. This article investigates decentralized learning of randomization-based neural networks that provides centralized equivalent performance, as if the full training data were available at a single node. We consider five randomization-based neural networks that use convex optimization for learning. Two of the five neural networks are shallow, and the others are deep. The use of convex optimization is the key to applying the alternating direction method of multipliers (ADMM) with decentralized average consensus (DAC), which allows us to establish decentralized learning with centralized equivalence. For the underlying communication network topology, we use a doubly-stochastic network policy matrix and synchronous communication. Experiments with nine benchmark datasets show that the five neural networks provide good performance while requiring low computational and communication complexity for decentralized learning.
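
One way to see why centralized equivalence is achievable for these convex subproblems: a regularized least-squares solution depends on the data only through sums of per-node statistics, so agreeing on network-wide averages (which the paper does via ADMM with decentralized average consensus) is enough to recover the pooled-data solution exactly. A small NumPy sketch of this identity follows; sizes are illustrative, and the direct summation stands in for the consensus step.

```python
import numpy as np

rng = np.random.default_rng(3)

K, d, m, n_k = 5, 6, 2, 40
lam = 0.5
H = [rng.normal(size=(n_k, d)) for _ in range(K)]
Y = [rng.normal(size=(n_k, m)) for _ in range(K)]

# Centralized ridge solution on the pooled data.
H_all, Y_all = np.vstack(H), np.vstack(Y)
W_central = np.linalg.solve(H_all.T @ H_all + lam * np.eye(d), H_all.T @ Y_all)

# Each node exposes only local sufficient statistics, never raw samples.
G = [Hk.T @ Hk for Hk in H]                      # local Gram matrices
R = [Hk.T @ Yk for Hk, Yk in zip(H, Y)]          # local cross-correlations

# Network-wide averages; in the paper these are obtained with decentralized
# average consensus over the doubly-stochastic network policy matrix.
G_avg = sum(G) / K
R_avg = sum(R) / K
W_decentral = np.linalg.solve(K * G_avg + lam * np.eye(d), K * R_avg)

print(np.max(np.abs(W_decentral - W_central)))   # ~1e-15: exactly the centralized solution
```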

Keywords
Randomized neural network, Distributed learning, Multi-layer feedforward neural network, Alternating direction method of multipliers
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-295430 (URN)
Note

QC 20210524

Available from: 2021-05-20. Created: 2021-05-20. Last updated: 2022-06-25. Bibliographically approved
4. Asynchrounous decentralized learning of a neural network
2020 (English) In: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 3947-3951. Conference paper, Published paper (Refereed)
Abstract [en]

In this work, we exploit an asynchronous computing framework, namely ARock, to learn a deep neural network called the self-size estimating feedforward neural network (SSFN) in a decentralized scenario. Using this algorithm, namely asynchronous decentralized SSFN (dSSFN), we provide the centralized equivalent solution under certain technical assumptions. Asynchronous dSSFN relaxes the communication bottleneck by allowing one-node activation and one-sided communication, which reduces the communication overhead significantly and consequently increases the learning speed. We compare asynchronous dSSFN with traditional synchronous dSSFN in the experimental results, which show the competitive performance of asynchronous dSSFN, especially when the communication network is sparse.
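
The ARock framework referred to above lets a single randomly activated agent update its own block using possibly stale information, with no global synchronization barrier. The sketch below is a serial idealization of that idea with illustrative sizes, applied to a plain ridge-regression problem rather than the paper's dSSFN training operator: randomized single-block activation still reaches the same solution a synchronous solver would.

```python
import numpy as np

rng = np.random.default_rng(4)

# Ridge regression: min_x 0.5*||A x - b||^2 + 0.5*lam*||x||^2 (sizes illustrative).
n, d, lam = 100, 12, 0.5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)
x_star = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)

# Randomized single-block activation: at each tick one randomly chosen block
# wakes up and applies only its own update, with no synchronization barrier.
x = np.zeros(d)
L = np.sum(A * A, axis=0) + lam        # coordinate-wise Lipschitz constants
for _ in range(20000):
    i = rng.integers(d)                # only block i is activated this tick
    g_i = A[:, i] @ (A @ x - b) + lam * x[i]
    x[i] -= g_i / L[i]

print(np.max(np.abs(x - x_star)))      # close to 0: same answer as a synchronous solver
```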

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020
Series
International Conference on Acoustics Speech and Signal Processing ICASSP, ISSN 1520-6149
Keywords
Asynchronous, decentralized learning, neural networks, convex optimization
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-292015 (URN)
10.1109/ICASSP40776.2020.9053996 (DOI)
000615970404039 (ISI)
2-s2.0-85089210003 (Scopus ID)
Conference
2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020
Note

QC 20210324

Available from: 2021-03-24. Created: 2021-03-24. Last updated: 2022-06-25. Bibliographically approved
5. Asynchronous Decentralized Learning of Randomization-based Neural Networks
2021 (English) Conference paper, Published paper (Refereed)
Abstract [en]

In a communication network, decentralized learning refers to knowledge collaboration between different local agents (processing nodes) to improve the local estimation performance without sharing private data. The ideal case is that the decentralized solution approximates the centralized solution, as if all the data were available at a single node, while requiring low computational power and communication overhead. In this work, we propose decentralized learning of randomization-based neural networks with asynchronous communication that achieves centralized equivalent performance. We propose an ARock-based alternating direction method of multipliers (ADMM) algorithm that enables individual node activation and one-sided communication in an undirected connected network, characterized by a doubly-stochastic network policy matrix. In addition, the proposed algorithm reduces the computational cost and communication overhead due to its asynchronous nature. We study the proposed algorithm on different randomization-based neural networks, including ELM, SSFN, RVFL, and their variants, achieving centralized equivalent performance with efficient computation and communication costs. We also show that the proposed asynchronous decentralized learning algorithm can outperform a synchronous learning algorithm in terms of computational complexity, especially when the network connections are sparse.

Keywords
decentralized learning, neural networks, asynchronous communication, ADMM
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-295431 (URN)
10.1109/IJCNN52387.2021.9533574 (DOI)
000722581702035 (ISI)
2-s2.0-85116479449 (Scopus ID)
Conference
International Joint Conference on Neural Networks (IJCNN)
Note

QC 20210520

Available from: 2021-05-20. Created: 2021-05-20. Last updated: 2022-09-23. Bibliographically approved
6. Asynchronous Decentralized Learning of Randomization-based Neural Networks with Centralized Equivalence
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Siloed data localization has become a big challenge for machine learning. Restricted by scattered locations and privacy regulations on information sharing, recent studies aim to develop collaborative machine learning techniques in which local models approximate the centralized performance without sharing real data. In this work, we design an asynchronous decentralized learning application that achieves centralized equivalent performance with low computational complexity and communication overhead. We propose an asynchronous decentralized learning algorithm (Async-dl) using ARock-based ADMM to realize decentralized variants of various randomization-based feedforward neural networks. The proposed algorithm enables single-node activation and one-sided communication in an undirected and weighted communication network, characterized by a doubly-stochastic network policy matrix. In addition, the proposed algorithm obtains a centralized solution with reduced computational cost and improved communication efficiency. We investigate five randomization-based neural networks and apply Async-dl to realize their decentralized setups. The neural network architectures are the extreme learning machine (ELM), random vector functional link (RVFL), deep random vector functional link (dRVFL), ensemble deep random vector functional link (edRVFL), and self-size estimating feedforward neural network (SSFN). Extensive experiments show that the proposed asynchronous decentralized learning algorithm outperforms the synchronous learning algorithm in terms of computational complexity and communication efficiency, reflected in reduced training time, especially when the network connections are sparse. We also observe that the proposed algorithm is fault-tolerant: even if some communications fail, it still converges to the centralized solution.

Keywords
decentralized learning, randomization-based neural networks, asynchronous computing, centralized equivalent performance
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-295429 (URN)
Note

QC 20210524

Available from: 2021-05-20. Created: 2021-05-20. Last updated: 2022-06-25. Bibliographically approved
7. Learning without Forgetting for Decentralized Neural Nets with Low Communication Overhead
2021 (English) In: 2020 28th European Signal Processing Conference (EUSIPCO), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 2185-2189. Conference paper, Published paper (Refereed)
Abstract [en]

We consider the problem of training a neural network in a decentralized scenario with a low communication overhead. The problem is addressed by adapting a recently proposed incremental learning approach called 'learning without forgetting'. While an incremental learning approach assumes that data arrive in a sequence, the nodes of the decentralized scenario cannot share data among themselves, and there is no master node. Nodes can communicate information about model parameters with their neighbors. Communication of model parameters is the key to adapting the 'learning without forgetting' approach to the decentralized scenario. We use random-walk based communication to handle a highly limited communication resource.
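
As an illustration of the random-walk communication pattern only, the following toy NumPy sketch passes model parameters along a random walk over a ring of nodes; each visited node refits on its private data with a proximal pull toward the parameters it received. The proximal term is a deliberately simplified stand-in for the paper's learning-without-forgetting objective, and all sizes, constants, and the topology are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# 6 nodes on a ring; each holds private samples from the same linear model (toy data).
K, d, n_k = 6, 5, 30
w_true = rng.normal(size=d)
H = [rng.normal(size=(n_k, d)) for _ in range(K)]
y = [Hk @ w_true + 0.1 * rng.normal(size=n_k) for Hk in H]
neighbors = {k: [(k - 1) % K, (k + 1) % K] for k in range(K)}

def pooled_mse(w):
    return np.mean([np.mean((Hk @ w - yk) ** 2) for Hk, yk in zip(H, y)])

# Random walk: the current node refits on its local data with a proximal pull
# toward the parameters it received (a simplified stand-in for the learning-
# without-forgetting objective), then forwards the result to a random neighbour.
# No raw data ever leaves a node.
mu = 5.0
w = np.zeros(d)
node = 0
print("initial pooled MSE:", pooled_mse(w))
for _ in range(200):
    Hk, yk = H[node], y[node]
    w = np.linalg.solve(Hk.T @ Hk + mu * np.eye(d), Hk.T @ yk + mu * w)
    node = rng.choice(neighbors[node])
print("final pooled MSE:  ", pooled_mse(w))
```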

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Keywords
Decentralized learning, feedforward neural net, learning without forgetting, low communication overhead
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-295432 (URN)
10.23919/Eusipco47968.2020.9287777 (DOI)
000632622300440 (ISI)
2-s2.0-85099303579 (Scopus ID)
Conference
28th European Signal Processing Conference (EUSIPCO), Amsterdam
Note

QC 20210621

Available from: 2021-05-20. Created: 2021-05-20. Last updated: 2022-06-25. Bibliographically approved

Open Access in DiVA

fulltext (FULLTEXT01.pdf, 2791 kB, application/pdf)

Search in DiVA

By author/editor
Liang, Xinyue
By organisation
Information Science and Engineering
Communication Systems; Telecommunications
