kth.se Publications

Publications (9 of 9)
Lei, W., Fuster-Barcelo, C., Reder, G., Munoz-Barrutia, A. & Ouyang, W. (2024). BioImage.IO Chatbot: a community-driven AI assistant for integrative computational bioimaging [Letter to the editor]. Nature Methods, 21(8)
2024 (English). In: Nature Methods, ISSN 1548-7091, E-ISSN 1548-7105, Vol. 21, no 8. Article in journal, Letter (Refereed), Published
Place, publisher, year, edition, pages
Springer Nature, 2024
National Category
Human Computer Interaction
Identifiers
urn:nbn:se:kth:diva-353008 (URN) | 10.1038/s41592-024-02370-y (DOI) | 001297657900018 () | 39122937 (PubMedID) | 2-s2.0-85200738078 (Scopus ID)
Note

QC 20240911

Available from: 2024-09-11. Created: 2024-09-11. Last updated: 2024-09-11. Bibliographically approved
Lei, W. (2022). A study of wireless communications with reinforcement learning. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The explosive proliferation of mobile users and wireless data traffic in recent years poses imminent challenges to wireless system design. The trend of wireless communications becoming more complicated, decentralized and intelligent is inevitable. Many key issues in this field are decision-making problems, such as resource allocation, transmission control, and intelligent beam tracking in millimeter-wave (mmWave) systems. Reinforcement learning (RL) was once a languishing field of AI for solving sequential decision-making problems, but it was revived in the late 80s and early 90s when it was connected to dynamic programming (DP). Recently, RL has progressed in many applications, especially where the underlying models have no explicit mathematical solutions and simulations must be used. For instance, the success of RL in AlphaGo and AlphaZero motivated much recent RL research in both academia and industry. Moreover, since computation power has increased dramatically within the last decade, simulation and online learning (planning) methods have become feasible for RL implementation and deployment. Despite this potential, the applications of RL to wireless communications are still far from mature. It is therefore of great interest to investigate RL-based methods and algorithms adapted to different wireless communication scenarios. More specifically, this thesis can be roughly divided into the following parts:

In the first part of the thesis, we develop a framework based on deep RL (DRL) to solve the spectrum allocation problem in the emerging integrated access and backhaul (IAB) architecture with large-scale deployment and a dynamic environment. We propose to use a recent DRL method that integrates an actor-critic spectrum allocation (ACSA) scheme and a deep neural network (DNN) to achieve real-time spectrum allocation in different scenarios. The proposed methods are evaluated through numerical simulations and show promising results compared with some baseline allocation policies.

In the second part of the thesis, we investigate decentralized RL algorithms using the alternating direction method of multipliers (ADMM) in edge IoT applications. For RL in a decentralized setup, edge nodes (agents) connected through a communication network aim to work collaboratively to find a policy that optimizes the global reward as the sum of local rewards. However, communication costs, scalability, and adaptation in complex environments with heterogeneous agents may significantly limit the performance of decentralized RL. ADMM has a structure that allows for decentralized implementation and has shown faster convergence than gradient-descent-based methods. We therefore propose an adaptive stochastic incremental ADMM (asI-ADMM) algorithm and apply it to decentralized RL in edge-computing-empowered IoT networks. We provide convergence properties for the proposed algorithms by designing a Lyapunov function and prove that asI-ADMM has an O(1/k) + O(1/M) convergence rate, where k and M are the number of iterations and batch samples, respectively.

The third part of the thesis considers the problem of joint beam training and data transmission control for delay-sensitive communications over mmWave channels. We formulate the problem as a constrained Markov decision process (MDP) that aims to minimize the cumulative energy consumption over the whole considered period of time under delay constraints. By introducing a Lagrange multiplier, we reformulate the constrained MDP as an unconstrained one. We then solve it with a parallel-rollout-based RL method in a data-driven manner. Our numerical results demonstrate that the optimized policy obtained from parallel rollout significantly outperforms other baseline policies in both energy consumption and delay performance.

The final part of the thesis is a further study of the beam tracking problem using a supervised learning approach. Due to computation and delay limitations in real deployments, a lightweight algorithm is desired for beam tracking in mmWave networks. We formulate the beam tracking (beam sweeping) problem as a binary classification problem and investigate supervised learning methods for its solution. The methods are tested both in simulation, i.e., a ray-tracing model, and on real test data from an Ericsson over-the-air (OTA) dataset. The results show that the proposed methods can significantly improve cell capacity and reduce overhead as the number of UEs in the network increases.
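The constrained-MDP step in the third part can be written out explicitly. The notation below (per-slot energy cost e_t, total delay D, delay budget D_max, multiplier λ) is ours for illustration, not necessarily the thesis's:

```latex
% Constrained MDP: minimize expected energy subject to a delay budget
\min_{\pi}\; \mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{T} e_t\Big]
\quad\text{s.t.}\quad \mathbb{E}_{\pi}[D] \le D_{\max}

% Lagrangian relaxation solved in a data-driven manner
\mathcal{L}(\pi,\lambda)
  = \mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{T} e_t\Big]
  + \lambda\big(\mathbb{E}_{\pi}[D]-D_{\max}\big),
\qquad \lambda \ge 0
```

For a fixed λ, minimizing 𝓛 over π is an ordinary unconstrained MDP, which is what makes a data-driven rollout method applicable.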


Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2022. p. 148
Series
TRITA-EECS-AVL ; 2022:26
Keywords
Reinforcement learning, wireless communications, decentralized learning, beam tracking, machine learning, Förstärkningsinlärning, trådlös kommunikation, decentraliserad inlärning, strålspårning i mmvåg, maskininlärning
National Category
Applied Mechanics
Research subject
Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-312916 (URN) | 978-91-8040-205-7 (ISBN)
Public defence
2022-06-14, F3, Lindstedtsvägen 26, Stockholm, 14:00 (English)
Note

QC 20220524

Available from: 2022-05-24. Created: 2022-05-24. Last updated: 2022-09-20. Bibliographically approved
Lei, W., Lu, C., Huang, Y., Rao, J., Xiao, M. & Skoglund, M. (2022). Adaptive Beam Sweeping With Supervised Learning. IEEE Wireless Communications Letters, 11(12), 2650-2654
2022 (English). In: IEEE Wireless Communications Letters, ISSN 2162-2337, E-ISSN 2162-2345, Vol. 11, no 12, p. 2650-2654. Article in journal (Refereed), Published
Abstract [en]

Utilizing millimeter-wave (mmWave) frequencies for wireless communication in mobile systems is challenging, since continuous tracking of the beam direction is needed. For this purpose, beam sweeping is performed periodically. Such an approach can be sufficient in the initial deployment of the network, when the number of users is small, but a more efficient solution is needed once many users are connected, because of the higher overhead consumption. We explore a supervised learning approach that adaptively performs beam sweeping, has low implementation complexity, and can improve cell capacity by reducing beam sweeping overhead. By formulating the beam tracking problem as a binary classification problem, we apply supervised learning methods to solve it. The methods were tested on two scenarios: a ray-tracing outdoor scenario and an over-the-air (OTA) testing dataset from Ericsson. Both sets of results show that the proposed methods significantly increase cell throughput compared with existing exhaustive and periodic sweeping strategies.
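The binary-classification framing can be illustrated with a minimal sketch: a logistic-regression trigger that decides whether a sweep is needed from a single synthetic feature (recent SNR drop). The feature, the 2 dB threshold, and the data are invented for illustration; the paper's actual features and the Ericsson dataset are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: one feature (recent SNR drop in dB) and a label
# saying whether a sweep would have been needed. Both are invented here.
X = rng.normal(0.0, 3.0, size=(500, 1))
y = (X[:, 0] > 2.0).astype(float)   # hypothetical ground truth

# Plain gradient-descent logistic regression, to keep the sketch
# dependency-free.
w, b = np.zeros(1), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted sweep probability
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * float(np.mean(p - y))

def should_sweep(snr_drop_db: float) -> bool:
    """Trigger a beam sweep only when the classifier predicts one is needed."""
    return bool(snr_drop_db * w[0] + b > 0.0)

print(should_sweep(5.0), should_sweep(-1.0))
```

The overhead saving comes from skipping sweeps whenever `should_sweep` returns `False`, rather than sweeping on a fixed period.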

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Keywords
Millimeter-wave, beam tracking, supervised learning
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-323209 (URN) | 10.1109/LWC.2022.3213233 (DOI) | 000901617600038 () | 2-s2.0-85139868526 (Scopus ID)
Note

QC 20230130

Available from: 2023-01-30. Created: 2023-01-30. Last updated: 2023-03-14. Bibliographically approved
Lei, W., Ye, Y., Xiao, M., Skoglund, M. & Han, Z. (2022). Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge IoT. IEEE Internet of Things Journal, 9(22), 22958-22971
2022 (English). In: IEEE Internet of Things Journal, ISSN 2327-4662, Vol. 9, no 22, p. 22958-22971. Article in journal (Refereed), Published
Abstract [en]

Edge computing provides a promising paradigm to support the implementation of the Internet of Things (IoT) by offloading tasks to nearby edge nodes. Meanwhile, the increasing network size makes centralized data processing impractical due to limited bandwidth, so a decentralized learning scheme is preferable. Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes. For RL in a decentralized setup, edge nodes (agents) connected through a communication network aim to work collaboratively to find a policy that optimizes the global reward as the sum of local rewards. However, communication costs, scalability, and adaptation in complex environments with heterogeneous agents may significantly limit the performance of decentralized RL. The alternating direction method of multipliers (ADMM) has a structure that allows for decentralized implementation and has shown faster convergence than gradient-descent-based methods. Therefore, we propose an adaptive stochastic incremental ADMM (asI-ADMM) algorithm and apply it to decentralized RL in edge-computing-empowered IoT networks. We provide convergence properties for the proposed algorithms by designing a Lyapunov function and prove that asI-ADMM has an O(1/k) + O(1/M) convergence rate, where k and M are the number of iterations and batch samples, respectively. We first test our algorithm on two supervised learning problems. For performance evaluation, we then simulate two applications in decentralized RL settings with homogeneous and heterogeneous agents. The experimental results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and adapt well to complex IoT environments.
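The incremental, consensus-style structure that makes ADMM attractive here can be sketched on a toy problem. The snippet below runs a simplified incremental consensus ADMM where one agent updates per step; it is not the asI-ADMM of the paper, and the quadratic local losses, the targets, and the penalty rho = 1 are illustrative assumptions.

```python
import numpy as np

# Each of N edge agents holds a private target a_i and a local loss
# f_i(x) = (x - a_i)^2 / 2; the consensus minimizer of sum_i f_i is mean(a).
a = np.array([1.0, 4.0, 2.0, 7.0])
N, rho = len(a), 1.0

x = np.zeros(N)   # local primal variables
y = np.zeros(N)   # local dual variables
z = 0.0           # shared consensus variable, passed around like a token

for k in range(2000):
    i = k % N     # incremental: exactly one agent updates per step
    # Closed-form local prox step for the quadratic f_i
    x[i] = (a[i] - y[i] + rho * z) / (1.0 + rho)
    # Consensus update followed by the visited agent's dual ascent
    z = float(np.mean(x + y / rho))
    y[i] += rho * (x[i] - z)

print(round(z, 3))   # should approach mean(a) = 3.5
```

At the fixed point every x_i equals z and the duals satisfy y_i = a_i − z with mean zero, which forces z = mean(a); only one agent communicates per step, which is the property the paper exploits to cut communication costs.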

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Keywords
Communication efficiency, decentralized edge computing, reinforcement learning (RL), stochastic alternating direction method of multiplier (ADMM), Complex networks, Data handling, Decision making, Edge computing, Gradient methods, Internet of things, Job analysis, Lyapunov functions, Random processes, Reinforcement learning, Scalability, Stochastic systems, Alternating directions method of multipliers, Convergence, Decentralised, Optimisations, Reinforcement learnings, Stochastic alternating direction method of multiplier, Stochastics, Task analysis, Optimization
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-325693 (URN) | 10.1109/JIOT.2022.3187067 (DOI) | 000879049400078 () | 2-s2.0-85133805601 (Scopus ID)
Note

QC 20230412

Available from: 2023-04-12. Created: 2023-04-12. Last updated: 2023-04-12. Bibliographically approved
Lei, W., Zhang, D., Ye, Y. & Lu, C. (2022). Joint Beam Training and Data Transmission Control for mmWave Delay-Sensitive Communications: A Parallel Reinforcement Learning Approach. IEEE Journal of Selected Topics in Signal Processing, 16(3), 447-459
2022 (English). In: IEEE Journal of Selected Topics in Signal Processing, ISSN 1932-4553, Vol. 16, no 3, p. 447-459. Article in journal (Refereed), Published
Abstract [en]

Future communication networks call for new solutions to support their capacity and delay demands by leveraging the potential of the millimeter wave (mmWave) frequency band. However, the beam training procedure in mmWave systems incurs significant overhead as well as high energy consumption. As such, deriving an adaptive control policy benefits both delay-sensitive and energy-efficient data transmission over mmWave networks. To this end, we investigate the problem of joint beam training and data transmission control for mmWave delay-sensitive communications in this paper. Specifically, the considered problem is first formulated as a constrained Markov decision process (MDP), which aims to minimize the cumulative energy consumption over the whole considered period of time under a delay constraint. By introducing a Lagrange multiplier, we transform the constrained MDP into an unconstrained one, which is then solved via a parallel-rollout-based reinforcement learning method in a data-driven manner. Our numerical results demonstrate that the optimized policy via parallel rollout significantly outperforms other baseline policies in both energy consumption and delay performance.
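The rollout idea can be sketched on a toy transmission-control MDP: the Lagrangian stage cost combines energy with λ times the queue length, and the action whose best base-policy continuation is cheapest is selected. The queue dynamics, the multiplier value, and the two base policies below are invented for illustration, not taken from the paper.

```python
import random

# Toy delay-sensitive link: state q = queue length; action 1 transmits one
# packet at unit energy cost; a new packet arrives w.p. 0.4 each slot.
# LAM is the Lagrange multiplier folding the delay constraint into the
# stage cost. All numbers here are invented for illustration.
LAM, ARRIVAL_P, HORIZON = 0.5, 0.4, 30

def step(q, action, rng):
    energy = 1.0 if action else 0.0
    q = max(q - action, 0) + (1 if rng.random() < ARRIVAL_P else 0)
    return q, energy + LAM * q          # Lagrangian stage cost

# Two heuristic base policies for the rollout to draw on.
base_policies = [
    lambda q: 1,                        # always transmit
    lambda q: 1 if q >= 3 else 0,       # transmit only once backlog builds
]

def rollout_cost(q, policy, rng):
    total = 0.0
    for _ in range(HORIZON):
        q, c = step(q, policy(q), rng)
        total += c
    return total

def parallel_rollout_action(q, n_samples=200, seed=0):
    """Pick the action whose cheapest base-policy continuation wins."""
    rng = random.Random(seed)
    best_action, best_cost = 0, float("inf")
    for action in (0, 1):
        for policy in base_policies:    # parallel rollout: min over policies
            cost = 0.0
            for _ in range(n_samples):
                q1, c = step(q, action, rng)
                cost += c + rollout_cost(q1, policy, rng)
            cost /= n_samples
            if cost < best_cost:
                best_cost, best_action = cost, action
    return best_action

print(parallel_rollout_action(5), parallel_rollout_action(0))
```

With a large backlog the estimated continuation favors transmitting (the λ-weighted delay cost dominates the unit energy cost), while with an empty queue transmitting is pure waste; the rollout policy recovers both behaviors without hand-tuning a threshold.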

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Keywords
Training, Data communication, Transmitters, Energy consumption, Delays, Array signal processing, Reinforcement learning, Beam training, data-driven, delay-sensitive, Markov decision process, millimeter wave, reinforcement learning
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-312654 (URN) | 10.1109/JSTSP.2022.3143488 (DOI) | 000797421100015 () | 2-s2.0-85123368417 (Scopus ID)
Note

QC 20220530

Available from: 2022-05-19. Created: 2022-05-19. Last updated: 2022-06-25. Bibliographically approved
Lei, W., Ye, Y., Xiao, M., Skoglund, M. & Han, Z. (2021). Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT.
2021 (English). Manuscript (preprint) (Other academic)
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-312729 (URN)
Note

QC 20220530

Available from: 2022-05-21. Created: 2022-05-21. Last updated: 2022-07-12. Bibliographically approved
Lei, W., Ye, Y. & Xiao, M. (2020). Deep reinforcement learning-based spectrum allocation in integrated access and backhaul networks. IEEE Transactions on Cognitive Communications and Networking, 6(3), 970-979
2020 (English). In: IEEE Transactions on Cognitive Communications and Networking, E-ISSN 2332-7731, Vol. 6, no 3, p. 970-979. Article in journal (Refereed), Published
Abstract [en]

We develop a framework based on deep reinforcement learning (DRL) to solve the spectrum allocation problem in the emerging integrated access and backhaul (IAB) architecture with large-scale deployment and a dynamic environment. The available spectrum is divided into several orthogonal sub-channels, and the donor base station (DBS) and all IAB nodes share the same spectrum resource for allocation: the DBS utilizes the sub-channels both for access links of associated user equipment (UE) and for backhaul links of associated IAB nodes, while an IAB node can utilize all sub-channels for its associated UEs. This is one of the key features in which 5G differs from traditional settings, where backhaul networks are designed independently from access networks. With the goal of maximizing the sum log-rate of all UE groups, we formulate the spectrum allocation problem as a mixed-integer non-linear program. However, finding an optimal solution is intractable, especially when the IAB network is large and time-varying. To tackle this, we propose a DRL method that integrates an actor-critic spectrum allocation (ACSA) scheme and a deep neural network (DNN) to achieve real-time spectrum allocation in different scenarios. The proposed methods are evaluated through numerical simulations and show promising results compared with some baseline allocation policies.
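The sum-log-rate objective can be made concrete on a tiny instance. The spectral-efficiency numbers and the two-group setup below are invented, and an exhaustive search stands in for the paper's DRL scheme; the search is only viable because the instance is tiny, which is exactly the scalability gap the DRL method addresses.

```python
import itertools
import math

# Tiny IAB toy: 3 orthogonal sub-channels split between two UE groups (the
# DBS's own access UEs and one IAB node's UEs). The per-sub-channel
# spectral efficiencies below are invented numbers, not from the paper.
se = {
    "dbs_ues": [2.0, 1.5, 0.5],   # bits/s/Hz on sub-channels 0..2
    "iab_ues": [0.5, 1.0, 2.5],
}

def sum_log_rate(assignment):
    """assignment[c] names the UE group that receives sub-channel c."""
    rate = {g: 0.0 for g in se}
    for c, group in enumerate(assignment):
        rate[group] += se[group][c]
    # Sum-log-rate (proportional-fairness style) objective; the floor keeps
    # log() defined when a group receives no sub-channel at all.
    return sum(math.log(max(r, 1e-9)) for r in rate.values())

# Exhaustive search over all 2^3 assignments, feasible only at toy scale.
best = max(itertools.product(se, repeat=3), key=sum_log_rate)
print(best, round(sum_log_rate(best), 3))
```

Each group ends up with the sub-channels on which its spectral efficiency is highest, and the log in the objective keeps either group from being starved entirely.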

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-312652 (URN) | 10.1109/TCCN.2020.2992628 (DOI) | 000568659500009 () | 2-s2.0-85091582265 (Scopus ID)
Note

QC 20220530

Available from: 2022-05-19. Created: 2022-05-19. Last updated: 2023-01-25. Bibliographically approved
Huang, Y., Lei, W., Lu, C. & Berg, M. (2019). Fronthaul Functional Split of IRC-Based Beamforming for Massive MIMO Systems. In: 2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall): . Paper presented at 2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall) (pp. 1-5).
2019 (English). In: 2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall), 2019, p. 1-5. Conference paper, Published paper (Refereed)
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-312653 (URN)
Conference
2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall)
Note

QC 20220621

Available from: 2022-05-20. Created: 2022-05-20. Last updated: 2024-03-18. Bibliographically approved
Lei, W., Lu, C., Huang, Y., Rao, J., Xiao, M. & Skoglund, M. Adaptive Beam Tracking With Supervised Learning.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Utilizing millimeter-wave (mmWave) frequencies for wireless communication in mobile systems is challenging, since continuous tracking of the beam direction is needed. For this purpose, beam sweeping is performed periodically. Such an approach can be sufficient in the initial deployment of the network, when the number of users is small, but a more efficient solution is needed once many users are connected, because of the higher overhead consumption. We explore a supervised learning approach that adaptively performs beam sweeping, has low implementation complexity, and can improve cell capacity by reducing beam sweeping overhead. By formulating the beam tracking problem as a binary classification problem, we apply supervised learning methods to solve it. The methods were tested on two scenarios: a ray-tracing outdoor scenario and an over-the-air (OTA) testing dataset from Ericsson. Both sets of results show that the proposed methods significantly increase cell throughput compared with existing exhaustive and periodic sweeping strategies.

Keywords
millimeter-wave; beam tracking; supervised learning
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-312922 (URN)
Note

QC 20220524

Submitted to IEEE Wireless Communication Letters

Available from: 2022-05-24. Created: 2022-05-24. Last updated: 2022-06-25. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-9878-3722