Publications (10 of 11)
Katsikas, G. P., Barbette, T., Kostic, D., Maguire Jr., G. Q. & Steinert, R. (2021). Metron: High-Performance NFV Service Chaining Even in the Presence of Blackboxes. ACM Transactions on Computer Systems, 38(1-2), 1-45, Article ID 3.
2021 (English). In: ACM Transactions on Computer Systems, ISSN 0734-2071, E-ISSN 1557-7333, Vol. 38, no 1-2, p. 1-45, article id 3. Article in journal (Refereed), Published
Abstract [en]

Deployment of 100 Gigabit Ethernet (GbE) links challenges the packet processing limits of commodity hardware used for Network Functions Virtualization (NFV). Moreover, realizing chained network functions (i.e., service chains) necessitates the use of multiple CPU cores, or even multiple servers, to process packets from such high speed links.

Our system Metron jointly exploits the underlying network and commodity servers' resources: (i) to offload part of the packet processing logic to the network, (ii) by using smart tagging to set up and exploit the affinity of traffic classes, and (iii) by using tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers' cores, with zero inter-core communication. Moreover, Metron transparently integrates, manages, and load balances proprietary "blackboxes" together with Metron service chains.

Metron realizes stateful network functions at the speed of 100 GbE network cards on a single server, while elastically and rapidly adapting to changing workload volumes. Our experiments demonstrate that Metron service chains can coexist with heterogeneous blackboxes, while still leveraging Metron's accurate dispatching and load balancing. In summary, Metron has (i) 2.75-8× better efficiency, (ii) up to 4.7× lower latency, and (iii) up to 7.8× higher throughput than OpenBox, a state-of-the-art NFV system.
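
The tagging and dispatching idea in this abstract can be illustrated with a toy model. The sketch below is not Metron's implementation: the classifier, the `Tagger` and `Server` classes, the modulo mapping in `core_for_tag`, and the core count are all hypothetical names and choices introduced for illustration. It only shows the property the abstract highlights: once a traffic class carries a tag, every packet of that class lands on the same core, so per-class state can stay core-local.

```python
from collections import defaultdict

NUM_CORES = 4  # hypothetical core count, chosen for illustration


def traffic_class(pkt):
    """Hypothetical classifier: a traffic class is (protocol, destination port)."""
    return (pkt["proto"], pkt["dst_port"])


class Tagger:
    """Stands in for the network element that tags packets before they reach the server."""

    def __init__(self):
        self.tags = {}  # traffic class -> tag

    def tag(self, pkt):
        cls = traffic_class(pkt)
        # The first packet of a class allocates the next free tag; later packets reuse it.
        return self.tags.setdefault(cls, len(self.tags))


def core_for_tag(tag, num_cores=NUM_CORES):
    """Stands in for tag-based hardware dispatching: a tag always maps to one core."""
    return tag % num_cores


class Server:
    def __init__(self, num_cores=NUM_CORES):
        self.num_cores = num_cores
        # One private state table per core; no table is shared across cores.
        self.state = [defaultdict(int) for _ in range(num_cores)]

    def receive(self, pkt, tag):
        core = core_for_tag(tag, self.num_cores)
        # Toy "stateful" processing: a per-class packet counter kept on that core only.
        self.state[core][tag] += 1
        return core
```

Because the tag alone determines the core in this sketch, no core ever reads another core's table, which is the sense in which the toy model has no inter-core communication.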

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
elasticity, service chains, hardware offloading, accurate dispatching, 100 GbE, load balancing, tagging, blackboxes, NFV
National Category
Communication Systems; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-298691 (URN); 10.1145/3465628 (DOI); 000679809300003 (); 2-s2.0-85111657554 (Scopus ID)
Projects
European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 770889); Swedish Foundation for Strategic Research (SSF)
Note

QC 20210712

Available from: 2021-07-11 Created: 2021-07-11 Last updated: 2024-03-15
Behravesh, R., Perez-Ramirez, D. F., Rao, A., Harutyunyan, D., Riggio, R. & Steinert, R. (2020). ML-Driven DASH Content Pre-Fetching in MEC-Enabled Mobile Networks. In: 2020 16th International Conference on Network and Service Management (CNSM). Paper presented at 2020 16th International Conference on Network and Service Management (CNSM). Institute of Electrical and Electronics Engineers (IEEE)
2020 (English). In: 2020 16th International Conference on Network and Service Management (CNSM), Institute of Electrical and Electronics Engineers (IEEE), 2020. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-363715 (URN); 10.23919/CNSM50824.2020.9269054 (DOI); 000612229200018 (); 2-s2.0-85098664427 (Scopus ID)
Conference
2020 16th International Conference on Network and Service Management (CNSM)
Note

QC 20250526

Available from: 2025-05-21 Created: 2025-05-21 Last updated: 2025-05-26. Bibliographically approved
Liu, S., Steinert, R. & Kostic, D. (2018). Flexible distributed control plane deployment. In: Proceedings 2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018: Cognitive Management in a Cyber World, NOMS 2018. Paper presented at 2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018, Taipei, Taiwan, April 23-27, 2018 (pp. 1-7). Institute of Electrical and Electronics Engineers (IEEE)
2018 (English). In: Proceedings 2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018: Cognitive Management in a Cyber World, NOMS 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1-7. Conference paper, Published paper (Refereed)
Abstract [en]

For large-scale programmable networks, flexible deployment of distributed control planes is essential for service availability and performance. However, existing approaches only focus on placing controllers, whereas the consequent control traffic is often ignored. In this paper, we propose a black-box optimization framework offering the additional steps for quantifying the effect of the consequent control traffic when deploying a distributed control plane. Evaluating different implementations of the framework over real-world topologies shows that close-to-optimal solutions can be achieved. Moreover, experiments indicate that running a method for controller placement without considering the control traffic causes excessive bandwidth usage (worst cases varying between 20.1% and 50.1% more) and congestion, compared to our approach.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2018
Series
IEEE IFIP Network Operations and Management Symposium, ISSN 1542-1201
Keywords
Optimization, Traffic congestion, Bandwidth usage, Black-box optimization, Control traffic, Controller placements, Distributed control planes, Optimal solutions, Programmable network, Service availability, Controllers
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-238085 (URN); 10.1109/NOMS.2018.8406150 (DOI); 000541820800038 (); 2-s2.0-85050656041 (Scopus ID)
Conference
2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018, Taipei, Taiwan, April 23-27, 2018
Note

Part of proceedings: ISBN 978-1-5386-3416-5

QC 20190111

Available from: 2019-01-11 Created: 2019-01-11 Last updated: 2022-09-26. Bibliographically approved
Katsikas, G. P., Barbette, T., Kostic, D., Steinert, R. & Maguire Jr., G. Q. (2018). Metron: NFV Service Chains at the True Speed of the Underlying Hardware. Paper presented at The 15th USENIX Symposium on Networked Systems Design and Implementation.
2018 (English). Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers’ resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to set up and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers’ fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state-of-the-art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.

Keywords
NFV, service chains, offloading, hardware dispatching, high performance
National Category
Computer Sciences; Communication Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-223543 (URN)
Conference
The 15th USENIX Symposium on Networked Systems Design and Implementation
Projects
Time-Critical Clouds; WASP
Funder
Swedish Foundation for Strategic Research; Knut and Alice Wallenberg Foundation
Available from: 2018-02-22 Created: 2018-02-22 Last updated: 2024-03-15. Bibliographically approved
Rao, A. & Steinert, R. (2018). Probabilistic multi-RAT performance abstractions. In: NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium. Paper presented at NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium.
2018 (English). In: NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, 2018. Conference paper, Published paper (Refereed)
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-363712 (URN)
Conference
NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium
Note

QC 20250526

Available from: 2025-05-21 Created: 2025-05-21 Last updated: 2025-05-26. Bibliographically approved
Steinert, R. & Gillblad, D. (2012). Performance Evaluation of a Distributed and Probabilistic Network Monitoring Approach. Paper presented at The 8th International Conference on Network and Service Management (CNSM).
2012 (English). Conference paper, Published paper (Refereed)
Abstract [en]

We investigate the effects of employing a probabilistic fault detection approach relative to the performance of a deterministic network monitoring method. The approach has its foundation in probabilistic network management, in which performance limits and thresholds are specified in terms of e.g. probabilities or belief values. When combined with adaptive mechanisms, probabilistic approaches can potentially offer improved controllability, adaptivity and reliability, compared to deterministic monitoring methods. Results from synthetically generated and real network QoS measurements indicate that the probabilistic approach can generally perform at least as well as a deterministic algorithm, with a higher degree of predictable performance and resource-efficiency. Due to the stochastic nature of the algorithm, worse performance than expected is sometimes observed. Nevertheless, the results give additional support to some of the practical benefits expected in using probabilistic approaches for network management purposes.

National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-144606 (URN); 2-s2.0-84872084126 (Scopus ID)
Conference
The 8th International Conference on Network and Service Management (CNSM)
Note

QC 20140509

Available from: 2014-04-27 Created: 2014-04-27 Last updated: 2024-03-15. Bibliographically approved
Steinert, R., Gestrelius, S. & Gillblad, D. (2011). A Distributed Spatio-Temporal Event Correlation Protocol for Multi-Layer Virtual Networks. Paper presented at IEEE Global Telecommunications Conference (GLOBECOM).
2011 (English). Conference paper, Published paper (Refereed)
Abstract [en]

We present a distributed spatio-temporal event correlation protocol for multi-layer networks. The problems that we address relate to scalability in stacked overlay networks and network equipment with asynchronous clocks, which complicates the problem of event correlation. We describe a cross-layer protocol designed to address these problems, operating in a fully distributed manner and taking into account asynchronous timestamps. It is assumed that events in one layer may arise from a series of events in lower layers. Detected events that are spatially related in one layer are aggregated using a gossip-like protocol, and constitute a root cause. The set of aggregated events is disseminated to lower layers and used for temporal correlation. We have tested the scalability and the performance of the distributed event protocol, using both synthetically generated and real-world topologies. The results indicate that the average overhead produced for collecting events down the stack of overlays increases with the number of layers. For a fixed number of layers, the protocol scales similarly with the graph-theoretic properties for a network of increasing size.

National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-144605 (URN); 10.1109/GLOCOM.2011.6133988 (DOI); 000300509002116 (); 2-s2.0-84857207708 (Scopus ID)
Conference
IEEE Global Telecommunications Conference (GLOBECOM)
Note

QC 20140509

Available from: 2014-04-27 Created: 2014-04-27 Last updated: 2024-03-15. Bibliographically approved
Gonzales Prieto, A., Gillblad, D., Steinert, R. & Miron, A. (2011). Toward Decentralized Probabilistic Management. IEEE Communications Magazine, 49(7), 80-96
2011 (English). In: IEEE Communications Magazine, ISSN 0163-6804, E-ISSN 1558-1896, Vol. 49, no 7, p. 80-96. Article in journal (Refereed), Published
Abstract [en]

In recent years, data communication networks have grown to immense size and have been diversified by the mobile revolution. Existing management solutions are based on a centralized deterministic paradigm, which is appropriate for networks of moderate size operating in relatively stable conditions. However, it is becoming increasingly apparent that these management solutions are not able to cope with the large dynamic networks that are emerging. In this article, we argue that the adoption of a decentralized and probabilistic paradigm for network management will be crucial to meet the challenges of future networks, such as efficient resource usage, scalability, robustness, and adaptability. We discuss the potential of decentralized probabilistic management and its impact on management operations, and illustrate the paradigm by three example solutions for real-time monitoring and anomaly detection.

National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-144604 (URN); 10.1109/MCOM.2011.5936159 (DOI); 000292376000010 (); 2-s2.0-79959961883 (Scopus ID)
Note

QC 20140509

Available from: 2014-04-27 Created: 2014-04-27 Last updated: 2024-03-15. Bibliographically approved
Steinert, R. & Gillblad, D. (2010). Long-Term Adaptation and Distributed Detection of Local Network Changes. Paper presented at IEEE Global Telecommunications Conference (GLOBECOM).
2010 (English). Conference paper, Published paper (Refereed)
Abstract [en]

We present a statistical approach to distributed detection of local latency shifts in networked systems. For this purpose, response delay measurements are performed between neighbouring nodes via probing. The expected probe response delay on each connection is statistically modelled via parameter estimation. Adaptation to drifting delays is accounted for by the use of overlapping models, such that previous models are partially used as input to future models. Based on the symmetric Kullback-Leibler divergence metric, latency shifts can be detected by comparing the estimated parameters of the current and previous models. In order to reduce the number of detection alarms, thresholds for divergence and convergence are used. The method that we propose can be applied to many types of statistical distributions, and requires only constant memory compared to e.g., sliding window techniques and decay functions. Therefore, the method is applicable in various kinds of network equipment with limited capacity, such as sensor networks, mobile ad hoc networks etc. We have investigated the behaviour of the method for different model parameters. Further, we have tested the detection performance in network simulations, for both gradual and abrupt shifts in the probe response delay. The results indicate that over 90% of the shifts can be detected. Undetected shifts are mainly the effects of long convergence processes triggered by previous shifts. The overall performance depends on the characteristics of the shifts and the configuration of the model parameters.
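
The detection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each delay model is a univariate Gaussian summarized by a (mean, variance) pair, whereas the abstract says the method applies to many distribution types. The `ShiftDetector` class and its two threshold values are hypothetical. The divergence threshold raises an alarm, and the lower convergence threshold must be crossed before a new alarm can fire, which is one way to reduce repeated alarms.

```python
def symmetric_kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Symmetric KL divergence KL(p||q) + KL(q||p) between two univariate Gaussians.

    The log-variance terms of the two directed divergences cancel in the sum.
    """
    d2 = (mu_p - mu_q) ** 2
    return (var_p + d2) / (2.0 * var_q) + (var_q + d2) / (2.0 * var_p) - 1.0


class ShiftDetector:
    """Alarm with separate divergence and convergence thresholds (hypothetical values)."""

    def __init__(self, diverge=1.0, converge=0.2):
        self.diverge = diverge    # alarm when the divergence exceeds this
        self.converge = converge  # re-arm only after the divergence drops below this
        self.in_shift = False

    def update(self, prev_model, curr_model):
        """Compare (mean, variance) of the previous and current models.

        Returns True only on the update that first detects a shift.
        """
        d = symmetric_kl_gaussian(*prev_model, *curr_model)
        if not self.in_shift and d > self.diverge:
            self.in_shift = True
            return True
        if self.in_shift and d < self.converge:
            self.in_shift = False
        return False
```

Only the two parameter pairs and one flag are stored per connection, so memory use in this sketch stays constant regardless of how many probes have been observed.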

National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-144603 (URN); 10.1109/GLOCOM.2010.5684137 (DOI); 000287977405109 (); 2-s2.0-79551638217 (Scopus ID)
Conference
IEEE Global Telecommunications Conference (GLOBECOM)
Note

QC 20140509

Available from: 2014-04-27 Created: 2014-04-27 Last updated: 2024-03-15. Bibliographically approved
Bohlin, M., Doganay, K., Kreuger, P., Steinert, R. & Wärja, M. (2010). Searching for gas turbine maintenance schedules. The AI Magazine, 31(1), 21-36
2010 (English). In: The AI Magazine, ISSN 0738-4602, E-ISSN 2371-9621, Vol. 31, no 1, p. 21-36. Article in journal (Refereed), Published
Abstract [en]

Preventive-maintenance schedules occurring in industry are often suboptimal with regard to maintenance coallocation, loss-of-production costs, and availability. We describe the implementation and deployment of a software decision-support tool for the maintenance planning of gas turbines, with the goal of reducing the direct maintenance costs and the often costly production losses during maintenance down time. The optimization problem is formally defined, and we argue that the feasibility version is NP-complete. We outline a heuristic algorithm that can quickly solve the problem for practical purposes and validate the approach on a real-world scenario based on an oil production facility. We also compare the performance of our algorithm with results from using integer programming and discuss the deployment of the application. The experimental results indicate that down time reductions of up to 65 percent can be achieved, compared to traditional preventive maintenance. In addition, the use of our tool is expected to improve availability by up to 1 percent and to reduce the number of planned maintenance days by 12 percent. Compared to an integer programming approach, our algorithm is not optimal but is much faster and produces results that are useful in practice. Our test results and SIT AB's estimates based on operational use both indicate that significant savings can be achieved by using our software tool, compared to maintenance plans with fixed intervals.

Keywords
Co-allocation, Decision supports, Direct maintenance costs, Down time, Maintenance down time, Maintenance planning, Maintenance plans, Maintenance schedules, NP Complete, Oil production, Operational use, Optimization problems, Planned maintenance, Production cost, Production loss, Real-world scenario, Software tool, Test results, Turbine maintenance
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-150049 (URN); 10.1609/aimag.v31i1.2286 (DOI); 000276177700002 (); 2-s2.0-79951935661 (Scopus ID)
Note

QC 20140910

Available from: 2014-09-10 Created: 2014-08-29 Last updated: 2025-08-28. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-5893-7774