Moothedath, Vishnu Narayanan (ORCID: orcid.org/0000-0002-2739-5060)
Publications (10 of 16)
Moothedath, V. N., Seo, S., Petreska, N., Kloiber, B. & Gross, J. (2025). Delay Analysis of 5G HARQ in the Presence of Decoding and Feedback Latencies.
2025 (English). Manuscript (preprint) (Other academic)
Abstract [en]

The growing demand for stringent quality of service (QoS) guarantees in 5G networks requires accurate characterisation of delay performance, often measured using Delay Violation Probability (DVP) for a given target delay. Widely used retransmission schemes like Automatic Repeat reQuest (ARQ) and Hybrid ARQ (HARQ) improve QoS through effective feedback, incremental redundancy (IR), and parallel retransmission processes. However, existing works to quantify the DVP under these retransmission schemes overlook practical aspects such as decoding complexity, feedback delays, and the resulting need for multiple parallel ARQ/HARQ processes that enable packet transmissions without waiting for previous feedback, thus exploiting valuable transmission opportunities. This work proposes a comprehensive multi-server delay model for ARQ/HARQ that incorporates these aspects. Using a finite blocklength error model, we derive closed-form expressions and algorithms for accurate DVP evaluation under realistic 5G configurations aligned with 3GPP standards. Our numerical evaluations demonstrate notable improvements in DVP accuracy over the state-of-the-art, highlight the impact of parameter tuning and resource allocation, and reveal how DVP affects system throughput. 
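The role of decoding and feedback latencies in the delay budget can be illustrated with a toy Monte Carlo sketch. This is not the paper's model (which covers incremental redundancy, finite-blocklength errors, and multiple parallel HARQ processes); it simulates a single stop-and-wait HARQ process, and every parameter below, including the fixed per-round success probability `p_success`, is an invented placeholder.

```python
import random

def harq_delay(p_success, t_tx, t_dec, t_fb, max_rounds):
    """Delay of one packet under a single stop-and-wait HARQ process.

    Each round costs a transmission slot plus decoding latency; a failed
    round additionally waits for the feedback (NACK) before retransmitting.
    """
    delay = 0.0
    for _ in range(max_rounds):
        delay += t_tx + t_dec            # transmit and decode this round
        if random.random() < p_success:  # round decoded successfully
            return delay
        delay += t_fb                    # feedback latency before retry
    return delay                         # undecoded after max_rounds

def delay_violation_prob(target, n=100_000, **kw):
    """Empirical DVP: fraction of packets whose delay exceeds the target."""
    return sum(harq_delay(**kw) > target for _ in range(n)) / n

random.seed(0)
dvp = delay_violation_prob(target=3.0, p_success=0.7,
                           t_tx=0.5, t_dec=0.25, t_fb=0.25, max_rounds=4)
print(f"estimated DVP: {dvp:.4f}")
```

With these placeholder numbers, a packet meets the 3.0-unit target only if it decodes within the first three rounds, so the empirical DVP concentrates around 0.3³ = 0.027; the feedback term `t_fb` is exactly the kind of latency the abstract argues cannot be ignored.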

Keywords
Information Theory (cs.IT), Systems and Control (eess.SY), FOS: Computer and information sciences, FOS: Electrical engineering, electronic engineering, information engineering
HSV category
Identifiers
urn:nbn:se:kth:diva-369816 (URN); 10.48550/ARXIV.2502.08789 (DOI)
Note

QC 20250917

Available from: 2025-09-15. Created: 2025-09-15. Last updated: 2025-09-17. Bibliographically checked.
Moothedath, V. N. (2025). Towards Efficient Distributed Intelligence: Cost-Aware Sensing and Offloading for Inference at the Edge. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
2025 (English). Doctoral thesis, with papers (Other academic)
Abstract [en]

The ongoing proliferation of intelligent systems, driven by artificial intelligence (AI) and 6G, is leading to a surge in closed-loop inference tasks performed on distributed compute nodes. These systems operate under strict latency and energy constraints, extending the challenge beyond achieving high accuracy to enabling timely and energy-efficient inference. This thesis examines how distributed inference can be optimised through two key decisions: when to sample the environment and when to offload computation to a more accurate remote model. These decisions are guided by the semantics of the underlying environment and its associated costs. The semantics are kept abstract, and pre-trained inference models are employed, ensuring a platform-independent formulation adaptable to the rapid evolution of distributed intelligence and wireless technologies.

Regarding sampling, we studied the trade-off between sampling cost and detection delay in event-detection systems without sufficient local inference capabilities. The problem was posed as an optimisation over sampling instants under a stochastic event sequence and analysed at different levels of modelling complexity, ranging from periodic to aperiodic sampling. Closed-form, algorithmic, and approximate solutions were developed, with some results of independent mathematical interest. Simulations in realistic settings showed marked gains in efficiency over systems that neglect event semantics. In particular, aperiodic sampling achieved a stable improvement of ~10% over optimised periodic policies across parameter variations.

Regarding offloading, we introduced a novel Hierarchical Inference (HI) framework, which makes sequential offload decisions between a low-latency, energy-efficient local model and a high-accuracy remote model using locally available confidence measures. We proposed HI algorithms based on thresholds and ambiguity regions learned online by suitably extending Prediction with Expert Advice (PEA) approaches to continuous expert spaces and partial feedback. HI algorithms minimise the expected cost across inference rounds, combining offloading and misclassification costs, and are shown to achieve a uniformly sublinear regret of O(T^(2/3)). The proposed algorithms are agnostic to model architecture and communication systems, do not alter model training, and support model updates during operation. Benchmarks on standard classification tasks using the softmax output as a confidence measure showed that HI adaptively distributes inference based on offloading costs, achieving results close to the offline optimum. HI is shown to add resilience to distribution changes and model mismatches, especially when asymmetric misclassification costs are present.

In summary, this thesis presents efficient approaches for sampling and offloading of inference tasks, where various performance metrics are combined into a single cost structure. The work extends beyond conventional inference problems to areas with similar trade-offs, advancing toward efficient distributed intelligence that infers at the right time and in the right place. Future work includes conceptual extensions like joint sampling-offloading design, and integration with collaborative model-training architectures.

Abstract [sv]

Den pågående spridningen av intelligenta system, drivna av artificiell intelligens (AI) och 6G, leder till en ökning av återkopplade inferensuppgifter som utförs på distribuerade beräkningsnoder. Dessa system verkar under strikta krav på latens och energiförbrukning, vilket gör att utmaningen inte enbart handlar om att uppnå hög noggrannhet utan också om att möjliggöra snabb och energieffektiv inferens. Denna avhandling undersöker hur distribuerad inferens kan optimeras genom två centrala beslut: när miljön ska samplas och när beräkningar ska avlastas till en mer exakt, fjärrbelägen modell. Dessa beslut styrs av miljöns semantiska egenskaper och de kostnader som är förknippade med dessa. Semantiken hålls på en abstrakt nivå, och förtränade inferensmodeller används, vilket möjliggör en plattformsoberoende formulering som är anpassningsbar till den snabba utvecklingen inom distribuerad intelligens och trådlös kommunikation.

Angående sampling studerades avvägningen mellan samplingskostnad och detektionsfördröjning i händelsedetekteringssystem som saknar tillräcklig lokal inferenskapacitet. Ett optimeringsproblem över samplingstidpunkter formuleras för stokastiska händelser och analyserades på olika nivåer av modelleringskomplexitet, från periodisk till aperiodisk sampling. Slutna, algoritmiska, och approximativa lösningar utvecklades, varav vissa resultat även är av allmänt matematiskt intresse. Simuleringar i realistiska system visade tydliga effektivitetsvinster jämfört med system som bortser från händelsernas semantik. Särskilt aperiodisk sampling uppnådde en stabil förbättring på cirka 10% jämfört med periodiska strategier över olika systemparametrar.

Angående avlastning introducerades ett nytt ramverk för hierarkisk inferens (HI), som fattar sekventiella avlastningsbeslut mellan en lokal modell med låg fördröjning och energiförbrukning, och en fjärrmodell med högre noggrannhet, baserat på lokala konfidensmått. Vi föreslog HI-algoritmer baserade på tröskelvärden och ambiguitetsregioner som lärs in online genom att utvidga metoder för expertbaserad prediktion (Prediction with Expert Advice, PEA) till kontinuerliga expertrum med partiell återkoppling. HI-algoritmerna minimerar den förväntade kostnaden över flera inferensomgångar genom att kombinera kostnader för avlastning och felklassificering, och uppnår O(T^(2/3)) sublinjär ånger. De föreslagna algoritmerna är oberoende av modellarkitektur och kommunikationssystem, kräver ingen ändring av modellträningen, och stödjer modelluppdateringar under drift. Jämförelser på standardiserade klassificeringsuppgifter med softmax-värde som konfidensmått visade att HI fördelar inferens adaptivt beroende på avlastningskostnader och når resultat nära det offline-optimum som beräknats i efterhand. HI visade sig dessutom öka robustheten mot distributionsförändringar och modellavvikelser, särskilt i fall med asymmetriska felklassificeringskostnader.

Sammanfattningsvis presenterar avhandlingen effektiva metoder för sampling och avlastning av inferensuppgifter där olika prestandamått kombineras i en gemensam kostnadsstruktur. Arbetet sträcker sig bortom konventionella inferensproblem till områden med liknande avvägningar, och bidrar till utvecklingen av effektiv distribuerad intelligens som tar beslut vid rätt tidpunkt och på rätt plats. Framtida arbete inkluderar konceptuella utvidgningar såsom gemensam design av sampling och avlastning, samt integration med kollaborativa modellträningsarkitekturer.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2025. pp. xiii, 87
Series
TRITA-EECS-AVL ; 2026:4
Keywords
Artificial intelligence, communication, distributed intelligence, inference offloading, Artificiell intelligens, kommunikation, distribuerad intelligens, inferensavlastning
HSV category
Research programme
Electrical and Systems Engineering
Identifiers
urn:nbn:se:kth:diva-373298 (URN); 978-91-8106-482-7 (ISBN)
Public defence
2026-01-16, https://kth-se.zoom.us/s/61617488895, Salongen, Osquars backe 31, KTH Campus, Stockholm, 10:00 (English)
Opponent
Supervisor
Note

QC 20251127

Available from: 2025-11-27. Created: 2025-11-27. Last updated: 2025-12-09. Bibliographically checked.
Moothedath, V. N., Champati, J. P. & Gross, J. (2024). Getting the Best Out of Both Worlds: Algorithms for Hierarchical Inference at the Edge. IEEE Transactions on Machine Learning in Communications and Networking, 2, 280-297
2024 (English). In: IEEE Transactions on Machine Learning in Communications and Networking, E-ISSN 2831-316X, Vol. 2, pp. 280-297. Journal article (Refereed). Published
Abstract [en]

We consider a resource-constrained Edge Device (ED), such as an IoT sensor or a microcontroller unit, embedded with a small-size ML model (S-ML) for a generic classification application and an Edge Server (ES) that hosts a large-size ML model (L-ML). Since the inference accuracy of S-ML is lower than that of the L-ML, offloading all the data samples to the ES results in high inference accuracy, but it defeats the purpose of embedding S-ML on the ED and deprives the benefits of reduced latency, bandwidth savings, and energy efficiency of doing local inference. In order to get the best out of both worlds, i.e., the benefits of doing inference on the ED and the benefits of doing inference on the ES, we explore the idea of Hierarchical Inference (HI), wherein S-ML inference is only accepted when it is correct; otherwise, the data sample is offloaded for L-ML inference. However, the ideal implementation of HI is infeasible as the correctness of the S-ML inference is not known to the ED. We thus propose an online meta-learning framework that the ED can use to predict the correctness of the S-ML inference. In particular, we propose to use the probability corresponding to the maximum probability class output by S-ML for a data sample to decide whether to offload it or not. The resulting online learning problem turns out to be a Prediction with Expert Advice (PEA) problem with continuous expert space. For a full feedback scenario, where the ED receives feedback on the correctness of the S-ML once it accepts the inference, we propose the HIL-F algorithm and prove a sublinear regret bound √(n ln(1/λ_min)/2) without any assumption on the smoothness of the loss function, where n is the number of data samples and λ_min is the minimum difference between any two distinct maximum probability values across the data samples. For a no-local feedback scenario, where the ED does not receive the ground truth for the classification, we propose the HIL-N algorithm and prove that it has an O(n^(2/3) ln^(1/3)(1/λ_min)) regret bound. We evaluate and benchmark the performance of the proposed algorithms for image classification applications using four datasets, namely, Imagenette and Imagewoof [1], MNIST [2], and CIFAR-10 [3].
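The core offloading rule described in the abstract can be sketched as a one-line threshold test on the maximum softmax probability. This is an illustrative reconstruction, not the authors' code: the fixed threshold `theta` stands in for the value that HIL-F and HIL-N learn online over the continuous expert space, and the random confidence vectors are invented stand-ins for S-ML outputs.

```python
import numpy as np

def hi_decision(softmax_out, theta):
    """HI rule: accept the local (S-ML) prediction when its top-class
    probability clears the threshold theta; otherwise offload the
    sample to the large remote model (L-ML)."""
    return "accept" if softmax_out.max() >= theta else "offload"

# Toy confidence vectors standing in for S-ML softmax outputs.
rng = np.random.default_rng(1)
for s in rng.dirichlet(np.ones(10), size=5):
    print(f"max prob {s.max():.2f} -> {hi_decision(s, theta=0.5)}")
```

The online learning problem is then to pick `theta` so that the cumulative cost of wrong accepts and unnecessary offloads stays close to the best fixed threshold in hindsight, which is exactly the regret the HIL-F and HIL-N bounds control.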

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
HSV category
Identifiers
urn:nbn:se:kth:diva-343505 (URN); 10.1109/tmlcn.2024.3366501 (DOI); 001488462500001 ()
Research funder
Vinnova, 2019-00031; Swedish Research Council, 2022-03922
Note

QC 20240216

Available from: 2024-02-15. Created: 2024-02-15. Last updated: 2025-11-27. Bibliographically checked.
Letsiou, A., Moothedath, V. N., Behera, A. P., Champati, J. P. & Gross, J. (2024). Hierarchical Inference at the Edge: A Batch Processing Approach. In: Proceedings - 2024 IEEE/ACM Symposium on Edge Computing, SEC 2024: . Paper presented at 9th Annual IEEE/ACM Symposium on Edge Computing, SEC 2024, Rome, Italy, December 4-7, 2024 (pp. 476-482). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE/ACM Symposium on Edge Computing, SEC 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, pp. 476-482. Conference paper, published paper (Refereed)
Abstract [en]

Deep learning (DL) applications have rapidly evolved to address increasingly complex tasks by leveraging large-scale, resource-intensive models. However, deploying such models on low-power devices is not practical or economically scalable. While cloud-centric solutions satisfy these computational demands, they present challenges in terms of communication costs and latencies for real-time applications when every computation task is offloaded. To mitigate these concerns, hierarchical inference (HI) frameworks have been proposed, enabling edge devices equipped with small ML models to collaborate with edge servers by selectively offloading complex tasks. Existing HI approaches depend on immediate offloading of data upon selection, which can lead to inefficiencies due to frequent communication, especially in time-varying wireless environments. In this work, we introduce Batch HI, an approach that offloads samples in batches, thereby reducing communication overhead and improving system efficiency while achieving similar performance as existing HI methods. Additionally, we find the optimal batch size that attains a crucial balance between responsiveness and system time, tailored to specific user requirements. Numerical results confirm the effectiveness of our approach, highlighting the scenarios where batching is particularly beneficial.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
batching, edge computing, Hierarchical inference, offloading decisions, regret bound, responsiveness, tiny ML
HSV category
Identifiers
urn:nbn:se:kth:diva-359857 (URN); 10.1109/SEC62691.2024.00055 (DOI); 001424939400046 (); 2-s2.0-85216793011 (Scopus ID)
Conference
9th Annual IEEE/ACM Symposium on Edge Computing, SEC 2024, Rome, Italy, December 4-7, 2024
Note

Part of ISBN 979-8-3503-7828-3

QC 20250213

Available from: 2025-02-12. Created: 2025-02-12. Last updated: 2025-08-06. Bibliographically checked.
Moothedath, V. N., Champati, J. P. & Gross, J. (2023). Energy Efficient Sampling Policies for Edge Computing Feedback Systems. IEEE Transactions on Mobile Computing, 22(8), 4634-4647
2023 (English). In: IEEE Transactions on Mobile Computing, ISSN 1536-1233, E-ISSN 1558-0660, Vol. 22, no. 8, pp. 4634-4647. Journal article (Refereed). Published
Abstract [en]

We study the problem of finding efficient sampling policies in an edge-based feedback system, where sensor samples are offloaded to a back-end server that processes them and generates feedback to a user. Sampling the system at maximum frequency results in the detection of events of interest with minimum delay but incurs higher energy costs due to the communication and processing of redundant samples. On the other hand, a lower sampling frequency results in higher delay in detecting the event, thus increasing the idle energy usage and degrading the quality of experience. We quantify this trade-off as a weighted function between the number of samples and the sampling interval. We solve the minimisation problem for exponential and Rayleigh distributions of the random time to the event of interest. We prove the convexity of the objective functions by using novel techniques, which can be of independent interest elsewhere. We argue that adding an initial offset to the periodic sampling can further reduce the energy consumption, and we jointly compute the optimum offset and sampling interval. We apply our framework to two practically relevant applications and show energy savings of up to 36% when compared to an existing periodic scheme.
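The sampling trade-off described above can be made concrete with a small Monte Carlo sketch for the periodic case. This is not the paper's closed-form solution (and it omits the initial offset); the cost weight `w`, event rate `lam`, and the candidate-interval grid are invented placeholders.

```python
import math
import random

def expected_cost(tau, lam=1.0, w=0.2, n=20_000):
    """Monte Carlo estimate of a weighted cost of periodic sampling with
    interval tau: w * (number of samples) + detection delay, where the
    event time is Exp(lam). The weight w and rate lam are illustrative."""
    total = 0.0
    for _ in range(n):
        x = random.expovariate(lam)  # random time of the event
        k = math.floor(x / tau) + 1  # index of the detecting sample
        delay = k * tau - x          # time from event to its detection
        total += w * k + delay
    return total / n

random.seed(0)
# Scan candidate intervals and keep the cheapest one.
best_cost, best_tau = min((expected_cost(0.1 * i), 0.1 * i)
                          for i in range(1, 31))
print(f"best interval ~ {best_tau:.1f} (cost {best_cost:.3f})")
```

A very small `tau` pays for many redundant samples, a very large one pays in detection delay, and the estimated cost curve bottoms out in between, which is the interior optimum the paper characterises in closed form.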

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Cyber-physical systems, Delays, Edge computing, Energy consumption, Energy minimisation, Event detection, Feedback systems, Image edge detection, Monitoring, Optimal sampling, Video analytics systems, Visual analytics, Economic and social effects, Edge detection, Embedded systems, Energy efficiency, Green computing, Quality of service, Energy utilization
HSV category
Identifiers
urn:nbn:se:kth:diva-322988 (URN); 10.1109/TMC.2022.3165852 (DOI); 001022084500019 (); 2-s2.0-85128254606 (Scopus ID)
Note

QC 20251222

Available from: 2023-01-11. Created: 2023-01-11. Last updated: 2025-12-22. Bibliographically checked.
Mostafavi, S. S., Moothedath, V. N., Rönngren, S., Roy, N., Sharma, G. P., Seo, S., . . . Gross, J. (2023). ExPECA: An Experimental Platform for Trustworthy Edge Computing Applications. In: 2023 IEEE/ACM SYMPOSIUM ON EDGE COMPUTING, SEC 2023: . Paper presented at 8th Annual IEEE/ACM Symposium on Edge Computing (SEC), DEC 06-09, 2023, Wilmington, DE (pp. 294-299). Association for Computing Machinery (ACM)
2023 (English). In: 2023 IEEE/ACM Symposium on Edge Computing, SEC 2023, Association for Computing Machinery (ACM), 2023, pp. 294-299. Conference paper, published paper (Refereed)
Abstract [en]

This paper presents ExPECA, an edge computing and wireless communication research testbed designed to tackle two pressing challenges: comprehensive end-to-end experimentation and high levels of experimental reproducibility. Leveraging the OpenStack-based Chameleon Infrastructure (CHI) framework for its proven flexibility and ease of operation, ExPECA is located in a unique, isolated underground facility, providing a highly controlled setting for wireless experiments. The testbed is engineered to facilitate integrated studies of both communication and computation, offering a diverse array of Software-Defined Radios (SDR) and Commercial Off-The-Shelf (COTS) wireless and wired links, as well as containerized computational environments. We exemplify the experimental possibilities of the testbed using OpenRTiST, a latency-sensitive, bandwidth-intensive application, and analyze its performance. Lastly, we highlight an array of research domains and experimental setups that stand to gain from ExPECA's features, including closed-loop applications and time-sensitive networking.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Series
IEEE-ACM Symposium on Edge Computing, ISSN 2837-4819
Keywords
Edge computing experimental platform, reproducibility, end-to-end experimentation, wireless testbed
HSV category
Identifiers
urn:nbn:se:kth:diva-344956 (URN); 10.1145/3583740.3626819 (DOI); 001164050000036 (); 2-s2.0-85182828620 (Scopus ID)
Conference
8th Annual IEEE/ACM Symposium on Edge Computing (SEC), DEC 06-09, 2023, Wilmington, DE
Note

QC 20240408

Part of ISBN 979-8-4007-0123-8

Available from: 2024-04-08. Created: 2024-04-08. Last updated: 2025-05-09. Bibliographically checked.
Moothedath, V. N., Champati, J. P. & Gross, J. (2023). Online Algorithms for Hierarchical Inference in Deep Learning applications at the Edge.
2023 (English). Manuscript (preprint) (Other academic)
Abstract [en]

We consider a resource-constrained Edge Device (ED), such as an IoT sensor or a microcontroller unit, embedded with a small-size ML model (S-ML) for a generic classification application and an Edge Server (ES) that hosts a large-size ML model (L-ML). Since the inference accuracy of S-ML is lower than that of the L-ML, offloading all the data samples to the ES results in high inference accuracy, but it defeats the purpose of embedding S-ML on the ED and deprives the benefits of reduced latency, bandwidth savings, and energy efficiency of doing local inference. In order to get the best out of both worlds, i.e., the benefits of doing inference on the ED and the benefits of doing inference on ES, we explore the idea of Hierarchical Inference (HI), wherein S-ML inference is only accepted when it is correct, otherwise the data sample is offloaded for L-ML inference. However, the ideal implementation of HI is infeasible as the correctness of the S-ML inference is not known to the ED. We propose an online meta-learning framework that the ED can use to predict the correctness of the S-ML inference. In particular, we propose to use the maximum softmax value output by S-ML for a data sample and decide whether to offload it or not. The resulting online learning problem turns out to be a Prediction with Expert Advice (PEA) problem with continuous expert space. We propose two different algorithms and prove sublinear regret bounds for them without any assumption on the smoothness of the loss function. We evaluate and benchmark the performance of the proposed algorithms for image classification application using four datasets, namely, Imagenette and Imagewoof, MNIST, and CIFAR-10. 

Keywords
Machine Learning (cs.LG), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences
HSV category
Identifiers
urn:nbn:se:kth:diva-369817 (URN); 10.48550/ARXIV.2304.00891 (DOI)
Note

QC 20250918

Available from: 2025-09-15. Created: 2025-09-15. Last updated: 2025-09-18. Bibliographically checked.
Alsakati, M., Pettersson, C., Max, S., Moothedath, V. N. & Gross, J. (2023). Performance of 802.11be Wi-Fi 7 with Multi-Link Operation on AR Applications. In: 2023 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC: . Paper presented at IEEE Wireless Communications and Networking Conference (WCNC), MAR 26-29, 2023, Glasgow, SCOTLAND. Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: 2023 IEEE Wireless Communications and Networking Conference, WCNC, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, published paper (Refereed)
Abstract [en]

Since its first release in the late 1990s, Wi-Fi has been updated to keep up with evolving user needs. Recently, Wi-Fi and other radio access technologies have been pushed to their edge when serving Augmented Reality (AR) applications. AR applications require high throughput, low latency, and high reliability to ensure a high-quality user experience. The 802.11be amendment - which will be marketed as Wi-Fi 7 - introduces several features that aim to enhance its capabilities to support challenging applications like AR. One of the main features introduced in this amendment is Multi-Link Operation (MLO), which allows nodes to transmit and receive over multiple links concurrently. When using MLO, traffic is distributed among links using an implementation-specific traffic-to-link allocation policy. This paper aims to evaluate the performance of MLO, using different policies, in serving AR applications compared to Single-Link (SL). Experimental simulations using an event-based Wi-Fi simulator have been conducted. Our results show the general superiority of MLO when serving AR applications. MLO achieves lower latency and serves a higher number of AR users compared to SL with the same frequency resources. In addition, increasing the number of links can improve the performance of MLO. Regarding traffic-to-link allocation policies, we found that some policies are more susceptible to channel blocking than others, resulting in possible performance degradation.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE Wireless Communications and Networking Conference, ISSN 1525-3511
Keywords
Wi-Fi 7, IEEE 802.11be, MLO, Multi-link, AR
HSV category
Identifiers
urn:nbn:se:kth:diva-329378 (URN); 10.1109/WCNC55385.2023.10118866 (DOI); 000989491900227 (); 2-s2.0-85159783103 (Scopus ID)
Conference
IEEE Wireless Communications and Networking Conference (WCNC), MAR 26-29, 2023, Glasgow, SCOTLAND
Note

QC 20231122

Available from: 2023-06-20. Created: 2023-06-20. Last updated: 2023-11-22. Bibliographically checked.
Al-Atat, G., Fresa, A., Behera, A. P., Moothedath, V. N., Gross, J. & Champati, J. P. (2023). The Case for Hierarchical Deep Learning Inference at the Network Edge. In: NetAISys 2023 - Proceedings of the 1st International Workshop on Networked AI Systems, Part of MobiSys 2023: . Paper presented at 1st International Workshop on Networked AI Systems, NetAISys 2023, co-located with ACM MobiSys 2023, Helsinki, Finland, Jun 18 2023 (pp. 13-18). Association for Computing Machinery (ACM)
2023 (English). In: NetAISys 2023 - Proceedings of the 1st International Workshop on Networked AI Systems, Part of MobiSys 2023, Association for Computing Machinery (ACM), 2023, pp. 13-18. Conference paper, published paper (Refereed)
Abstract [en]

Resource-constrained Edge Devices (EDs), e.g., IoT sensors and microcontroller units, are expected to make intelligent decisions using Deep Learning (DL) inference at the edge of the network. Toward this end, developing tinyML models - DL models with reduced computation and memory storage requirements that can be embedded on these devices - is an area of active research. However, tinyML models have lower inference accuracy. On a different front, DNN partitioning and inference offloading techniques were studied for distributed DL inference between EDs and Edge Servers (ESs). In this paper, we explore Hierarchical Inference (HI), a novel approach proposed in [19] for performing distributed DL inference at the edge. Under HI, for each data sample, an ED first uses a local algorithm (e.g., a tinyML model) for inference. The ED offloads the data sample only if the local inference is incorrect or, depending on the application, further assistance is required from large DL models on the edge or cloud. At the outset, HI seems infeasible as the ED, in general, cannot know if the local inference is sufficient or not. Nevertheless, we present the feasibility of implementing HI for image classification applications. We demonstrate its benefits using quantitative analysis and show that HI provides a better trade-off between offloading cost, throughput, and inference accuracy compared to alternate approaches.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
deep learning, edge computing, hierarchical inference
HSV category
Identifiers
urn:nbn:se:kth:diva-334523 (URN); 10.1145/3597062.3597278 (DOI); 001119206300003 (); 2-s2.0-85164295939 (Scopus ID)
Conference
1st International Workshop on Networked AI Systems, NetAISys 2023, co-located with ACM MobiSys 2023, Helsinki, Finland, Jun 18 2023
Note

Part of ISBN 9798400702129

QC 20230823

Available from: 2023-08-23. Created: 2023-08-23. Last updated: 2025-07-16. Bibliographically checked.
Olguín Muñoz, M. O., Mostafavi, S. S., Moothedath, V. N. & Gross, J. (2022). Ainur: A Framework for Repeatable End-to-End Wireless Edge Computing Testbed Research. In: European Wireless Conference, EW 2022: . Paper presented at 2022 European Wireless Conference, EW 2022, Dresden, Germany, Sep 19 2022 - Sep 21 2022 (pp. 139-145). VDE VERLAG GMBH
2022 (English). In: European Wireless Conference, EW 2022, VDE VERLAG GMBH, 2022, pp. 139-145. Conference paper, published paper (Refereed)
Abstract [en]

Experimental research on wireless networking in combination with edge and cloud computing has been the subject of explosive interest in the last decade. This development has been driven by the increasing complexity of modern wireless technologies and the extensive softwarization of these through projects such as Open Radio Access Network (O-RAN). In this context, a number of small- to mid-scale testbeds have emerged, employing a variety of technologies to target a wide array of use-cases and scenarios in the context of novel mobile communication technologies such as 5G and beyond-5G. Little work, however, has yet been devoted to developing a standard framework for wireless testbed automation which is hardware-agnostic and compatible with edge- and cloud-native technologies. Such a solution would simplify the development of new testbeds by completely or partially removing the requirement for custom management and orchestration software. In this paper, we present the first such mostly hardware-agnostic wireless testbed automation framework, Ainur. It is designed to configure, manage, orchestrate, and deploy workloads from an end-to-end perspective. Ainur is built on top of cloud-native technologies such as Docker, and is provided as FOSS to the community through the KTH-EXPECA/Ainur repository on GitHub. We demonstrate the utility of the platform with a series of scenarios, showcasing in particular its flexibility with respect to physical link definition, computation placement, and automation of arbitrarily complex experimental scenarios.

Place, publisher, year, edition, pages
VDE VERLAG GMBH, 2022
Keywords
Edge Computing, Testbed, Automation, Experimental Research
HSV category
Research programme
Computer Science
Identifiers
urn:nbn:se:kth:diva-327217 (URN); 2-s2.0-85172030303 (Scopus ID)
Conference
2022 European Wireless Conference, EW 2022, Dresden, Germany, Sep 19 2022 - Sep 21 2022
Research funder
Swedish Foundation for Strategic Research, ITM17–0246
Note

Part of ISBN 9781713865698

QC 20230525

Available from: 2023-05-22. Created: 2023-05-22. Last updated: 2023-10-09. Bibliographically checked.
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-2739-5060