Stadler, Rolf, Prof. (ORCID iD: orcid.org/0000-0001-6039-8493)
Publications (10 of 76)
Samani, F. S. & Stadler, R. (2024). A Framework for dynamically meeting performance objectives on a service mesh.
A Framework for dynamically meeting performance objectives on a service mesh
2024 (English) Manuscript (preprint) (Other academic)
Abstract [en]

We present a framework for achieving end-to-end management objectives for multiple services that concurrently execute on a service mesh. We apply reinforcement learning (RL) techniques to train an agent that periodically performs control actions to reallocate resources. We develop and evaluate the framework using a laboratory testbed where we run information and computing services on a service mesh, supported by the Istio and Kubernetes platforms. We investigate different management objectives that include end-to-end delay bounds on service requests, throughput objectives, cost-related objectives, and service differentiation. Our framework supports the design of a control agent for a given management objective and is novel in three respects. First, it advocates a top-down approach whereby the management objective is defined first and then mapped onto the available control actions; several types of control actions can be executed simultaneously, which allows for efficient resource utilization. Second, the framework separates learning of the system model and the operating region from learning of the control policy. By first learning the system model and the operating region from testbed traces, we can instantiate a simulator and train the agent for different management objectives in parallel. Third, the use of a simulator shortens the training time by orders of magnitude compared with training the agent on the testbed. We evaluate the learned policies on the testbed and show the effectiveness of our approach in several scenarios. In one scenario, we design a controller that achieves the management objectives with 50% fewer system resources than Kubernetes HPA autoscaling.
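
As an illustration of the top-down approach, a management objective can be encoded directly as the reward signal the RL agent maximizes. The sketch below is our own illustration, not the paper's implementation; the function name, penalty terms, and weights are assumptions.

    # Hypothetical sketch: expressing a management objective (per-service
    # delay bounds, throughput targets, resource cost) as an RL reward.
    # The weights and penalty terms are illustrative assumptions.
    def reward(delays, delay_bounds, throughputs, throughput_targets,
               cpu_allocated, cost_weight=0.1):
        """Scalar reward for one control interval."""
        r = 0.0
        for d, bound in zip(delays, delay_bounds):
            r += 1.0 if d <= bound else -1.0      # end-to-end delay objective
        for t, target in zip(throughputs, throughput_targets):
            r += min(t / target, 1.0)             # throughput objective
        r -= cost_weight * cpu_allocated          # cost-related objective
        return r

Under an encoding of this kind, changing the management objective amounts to swapping the reward function, while the learned system model and simulator are reused.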

National Category
Engineering and Technology; Computer Systems
Identifiers
urn:nbn:se:kth:diva-346583 (URN)
Note

QC 20240522

Available from: 2024-05-18. Created: 2024-05-18. Last updated: 2024-05-22. Bibliographically approved.
Samani, F. S., Larsson, H., Damberg, S., Johnsson, A. & Stadler, R. (2024). Comparing Transfer Learning and Rollout for Policy Adaptation in a Changing Network Environment. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024. Paper presented at 2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6-10, 2024. Institute of Electrical and Electronics Engineers (IEEE)
Comparing Transfer Learning and Rollout for Policy Adaptation in a Changing Network Environment
2024 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Dynamic resource allocation for network services is pivotal for achieving end-to-end management objectives. Previous research has demonstrated that Reinforcement Learning (RL) is a promising approach to resource allocation in networks, making it possible to obtain near-optimal control policies for non-trivial system configurations. Current RL approaches, however, have the drawback that a change in the system or the management objective necessitates expensive retraining of the RL agent. To tackle this challenge, practical solutions including offline retraining, transfer learning, and model-based rollout have been proposed. In this work, we study these methods and present comparative results that shed light on their respective performance and benefits. Our study finds that rollout achieves faster adaptation than transfer learning, yet its effectiveness depends strongly on the accuracy of the system model.
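
The transfer-learning alternative studied here amounts to warm-starting training in the changed environment from the base policy's weights rather than from scratch. A minimal sketch follows, assuming a PyTorch policy network; policy_loss is a hypothetical placeholder for whatever RL objective is used.

    # Hypothetical sketch of transfer learning for policy adaptation:
    # copy the base policy's weights and fine-tune in the new environment.
    # PyTorch is our choice for illustration; policy_loss is a placeholder.
    import copy
    import torch

    def transfer(base_policy, new_env, policy_loss, steps=1000, lr=1e-3):
        policy = copy.deepcopy(base_policy)      # warm start from learned weights
        opt = torch.optim.Adam(policy.parameters(), lr=lr)
        for _ in range(steps):
            loss = policy_loss(policy, new_env)  # any on/off-policy RL loss
            opt.zero_grad()
            loss.backward()
            opt.step()
        return policy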

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Istio, Kubernetes, Performance management, policy adaptation, reinforcement learning, rollout, service mesh
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-351010 (URN); 10.1109/NOMS59830.2024.10575398 (DOI); 001270140300103 (); 2-s2.0-85198375028 (Scopus ID)
Conference
2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6-10, 2024
Note

Part of ISBN: 9798350327939

QC 20240725

Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2024-09-27. Bibliographically approved.
Hammar, K. & Stadler, R. (2024). Intrusion tolerance for networked systems through two-level feedback control. In: Proceedings - 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024. Paper presented at 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024, June 24-27, 2024, Brisbane, Australia (pp. 338-352). Institute of Electrical and Electronics Engineers (IEEE)
Intrusion tolerance for networked systems through two-level feedback control
2024 (English) In: Proceedings - 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 338-352. Conference paper, Published paper (Refereed)
Abstract [en]

We formulate intrusion tolerance for a system with service replicas as a two-level optimal control problem. On the local level, node controllers perform intrusion recovery; on the global level, a system controller manages the replication factor. The local and global control problems can be formulated as classical problems in operations research, namely the machine replacement problem and the inventory replenishment problem. Based on this formulation, we design TOLERANCE, a novel control architecture for intrusion-tolerant systems. We prove that the optimal control strategies on both levels have a threshold structure and design efficient algorithms for computing them. We implement and evaluate TOLERANCE in an emulation environment where we run 10 types of network intrusions. The results show that TOLERANCE can improve service availability and reduce operational cost compared with state-of-the-art intrusion-tolerant systems.
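
The threshold structure of the local recovery strategy can be seen in the classical machine replacement MDP the abstract refers to. The value iteration below is a toy illustration under assumed costs and dynamics, not the paper's algorithm.

    # Toy machine-replacement MDP: states 0..N model increasing degradation
    # (e.g., intrusion severity); the controller either waits or recovers.
    # Costs, dynamics, and discount factor are illustrative assumptions.
    import numpy as np

    N, RECOVER_COST, GAMMA = 10, 5.0, 0.95
    op_cost = np.linspace(0.0, 10.0, N + 1)    # operating cost grows with degradation
    nxt = np.minimum(np.arange(N + 1) + 1, N)  # degradation advances one step per period

    V = np.zeros(N + 1)
    for _ in range(1000):                      # value iteration (cost minimization)
        V = np.minimum(op_cost + GAMMA * V[nxt], RECOVER_COST + GAMMA * V[0])

    policy = np.where(op_cost + GAMMA * V[nxt] <= RECOVER_COST + GAMMA * V[0],
                      "wait", "recover")
    print(policy)  # "wait" up to some state, "recover" beyond it: a threshold policy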

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
BFT, Byzantine fault tolerance, CMDP, intrusion recovery, Intrusion tolerance, MDP, optimal control, POMDP
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-353946 (URN); 10.1109/DSN58291.2024.00042 (DOI); 2-s2.0-85203812073 (Scopus ID)
Conference
54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024, June 24-27, 2024, Brisbane, Australia
Note

Part of ISBN: 979-8-3503-4105-8

QC 20240926

Available from: 2024-09-25. Created: 2024-09-25. Last updated: 2024-09-26. Bibliographically approved.
Wang, X. & Stadler, R. (2024). IT Intrusion Detection Using Statistical Learning and Testbed Measurements. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024. Paper presented at 2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6-10, 2024. Institute of Electrical and Electronics Engineers (IEEE)
IT Intrusion Detection Using Statistical Learning and Testbed Measurements
2024 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

We study automated intrusion detection in an IT infrastructure, specifically the problem of identifying the start of an attack, the type of attack, and the sequence of actions an attacker takes, based on continuous measurements from the infrastructure. We apply statistical learning methods, including Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), and Random Forest Classifier (RFC) to map sequences of observations to sequences of predicted attack actions. In contrast to most related research, we have abundant data to train the models and evaluate their predictive power. The data comes from traces we generate on an in-house testbed where we run attacks against an emulated IT infrastructure. Central to our work is a machine-learning pipeline that maps measurements from a high-dimensional observation space to a space of low dimensionality or to a small set of observation symbols. Investigating intrusions in offline as well as online scenarios, we find that both HMM and LSTM can be effective in predicting attack start time, attack type, and attack actions. If sufficient training data is available, LSTM achieves higher prediction accuracy than HMM. HMM, on the other hand, requires fewer computational resources and less training data for effective prediction. Also, we find that the methods we study benefit from data produced by traditional intrusion detection systems like SNORT.
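
A pipeline of the kind described, mapping high-dimensional measurements to a small symbol alphabet before sequence prediction, might look as follows. This is a sketch under our own assumptions (PCA plus k-means for symbolization, a Random Forest over sliding symbol windows); the paper's pipeline details may differ.

    # Sketch: reduce high-dimensional measurements to observation symbols,
    # then predict attack actions from windows of recent symbols.
    # Component choices and parameters are illustrative assumptions.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    from sklearn.ensemble import RandomForestClassifier

    def fit_pipeline(X, y, n_components=10, n_symbols=20, window=5):
        """X: measurement matrix over time; y: attack action label per step."""
        pca = PCA(n_components=n_components).fit(X)
        km = KMeans(n_clusters=n_symbols, n_init=10).fit(pca.transform(X))
        syms = km.predict(pca.transform(X))
        # Feature vector at each step: the last `window` observation symbols.
        W = np.array([syms[i - window:i] for i in range(window, len(syms) + 1)])
        rfc = RandomForestClassifier().fit(W, y[window - 1:])
        return pca, km, rfc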

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
automated security, forensics, Hidden Markov Model, intrusion detection, Long Short-Term Memory, SNORT
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-351006 (URN); 10.1109/NOMS59830.2024.10575087 (DOI); 001270140300036 (); 2-s2.0-85198353664 (Scopus ID)
Conference
2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6-10, 2024
Note

QC 20241007

Part of ISBN: 9798350327939

Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2024-10-07. Bibliographically approved.
Hammar, K. & Stadler, R. (2024). Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. IEEE Transactions on Network and Service Management, 21(1), 1158-1177
Learning Near-Optimal Intrusion Responses Against Dynamic Attackers
2024 (English) In: IEEE Transactions on Network and Service Management, E-ISSN 1932-4537, Vol. 21, no. 1, p. 1158-1177. Article in journal (Refereed), Published
Abstract [en]

We study automated intrusion response and formulate the interaction between an attacker and a defender as an optimal stopping game where attack and defense strategies evolve through reinforcement learning and self-play. The game-theoretic modeling enables us to find defender strategies that are effective against a dynamic attacker, i.e., an attacker that adapts its strategy in response to the defender strategy. Further, the optimal stopping formulation allows us to prove that best response strategies have threshold properties. To obtain near-optimal defender strategies, we develop Threshold Fictitious Self-Play (T-FP), a fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that T-FP outperforms a state-of-the-art algorithm for our use case. The experimental part of this investigation includes two systems: a simulation system where defender strategies are incrementally learned and an emulation system where statistics are collected that drive simulation runs and where learned strategies are evaluated. We argue that this approach can produce effective defender strategies for a practical IT infrastructure.
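
The self-play idea behind T-FP can be illustrated on a toy zero-sum matrix game, where fictitious play (each player best-responding to the opponent's empirical strategy) converges toward a Nash equilibrium. The payoff matrix below is an arbitrary example, unrelated to the paper's stopping game.

    # Toy fictitious self-play on a 2x2 zero-sum game: empirical action
    # frequencies converge toward the mixed Nash equilibrium.
    import numpy as np

    A = np.array([[3.0, 0.0], [1.0, 2.0]])  # defender payoff; attacker gets -A
    n_d, n_a = np.ones(2), np.ones(2)       # action counts (uniform prior)
    for _ in range(10000):
        d = np.argmax(A @ (n_a / n_a.sum()))   # defender best response
        a = np.argmin((n_d / n_d.sum()) @ A)   # attacker best response
        n_d[d] += 1
        n_a[a] += 1
    print(n_d / n_d.sum(), n_a / n_a.sum())  # approx. (0.25, 0.75) and (0.5, 0.5)

For this matrix the equilibrium is the defender mixing 1/4 and 3/4 over its actions against a uniformly mixing attacker; the empirical frequencies approach these values, which is the convergence property the stochastic approximation in T-FP relies on.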

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Games, Security, Emulation, Reinforcement learning, Observability, Logic gates, History, Cybersecurity, network security, automated security, intrusion response, optimal stopping, Dynkin games, game theory, Markov decision process, MDP, POMDP
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-345922 (URN); 10.1109/TNSM.2023.3293413 (DOI); 001167106200022 (); 2-s2.0-85164381105 (Scopus ID)
Note

QC 20240502

Available from: 2024-05-02. Created: 2024-05-02. Last updated: 2024-07-04. Bibliographically approved.
Shahabsamani, F., Hammar, K. & Stadler, R. (2024). Online Policy Adaptation for Networked Systems using Rollout. Paper presented at the IEEE/IFIP Network Operations and Management Symposium, May 6-10, 2024, Seoul, South Korea.
Online Policy Adaptation for Networked Systems using Rollout
2024 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Dynamic resource allocation in networked systems is needed to continuously achieve end-to-end management objectives. Recent research has shown that reinforcement learning can achieve near-optimal resource allocation policies for realistic system configurations. However, most current solutions require expensive retraining when changes in the system occur. We address this problem and introduce an efficient method to adapt a given base policy to system changes, e.g., to a change in the service offering. In our approach, we adapt a base control policy using a rollout mechanism, which transforms the base policy into an improved rollout policy. We perform extensive evaluations on a testbed where we run applications on a service mesh based on the Istio and Kubernetes platforms. The experiments provide insights into the performance of different rollout algorithms. We find that our approach produces policies that are equally effective as those obtained by offline retraining. On our testbed, effective policy adaptation takes seconds when using rollout, compared to minutes or hours when using retraining. Our work demonstrates that rollout, which has been applied successfully in other domains, is an effective approach for policy adaptation in networked systems.

National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-346582 (URN)
Conference
IEEE/IFIP Network Operations and Management Symposium, May 6-10, 2024, Seoul, South Korea
Note

QC 20240522

Available from: 2024-05-18. Created: 2024-05-18. Last updated: 2024-06-10. Bibliographically approved.
Samani, F. S., Hammar, K. & Stadler, R. (2024). Online Policy Adaptation for Networked Systems using Rollout. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024. Paper presented at 2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6-10, 2024. Institute of Electrical and Electronics Engineers (IEEE)
Online Policy Adaptation for Networked Systems using Rollout
2024 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Dynamic resource allocation in networked systems is needed to continuously achieve end-to-end management objectives. Recent research has shown that reinforcement learning can achieve near-optimal resource allocation policies for realistic system configurations. However, most current solutions require expensive retraining when changes in the system occur. We address this problem and introduce an efficient method to adapt a given base policy to system changes, e.g., to a change in the service offering. In our approach, we adapt a base control policy using a rollout mechanism, which transforms the base policy into an improved rollout policy. We perform extensive evaluations on a testbed where we run applications on a service mesh based on the Istio and Kubernetes platforms. The experiments provide insights into the performance of different rollout algorithms. We find that our approach produces policies that are equally effective as those obtained by offline retraining. On our testbed, effective policy adaptation takes seconds when using rollout, compared to minutes or hours when using retraining. Our work demonstrates that rollout, which has been applied successfully in other domains, is an effective approach for policy adaptation in networked systems.
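
The rollout mechanism can be sketched as a one-step lookahead that simulates each candidate action with a system model and then follows the base policy. Everything below (model.step, reward, the horizon) is an assumption for illustration; the paper evaluates several rollout variants.

    # Hypothetical sketch of rollout: transform a base policy into an
    # improved policy by simulating each candidate action with a model.
    def rollout_action(state, actions, model, base_policy, reward,
                       horizon=5, gamma=0.99):
        def simulated_return(s, a):
            ret, discount = 0.0, 1.0
            for _ in range(horizon):
                s = model.step(s, a)      # learned or assumed system model
                ret += discount * reward(s)
                discount *= gamma
                a = base_policy(s)        # follow the base policy afterwards
            return ret
        return max(actions, key=lambda a: simulated_return(state, a))

Because only a short lookahead is computed online, adaptation takes seconds rather than the minutes or hours of retraining, which matches the comparison reported above.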

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Istio, Kubernetes, Performance management, policy adaptation, reinforcement learning, rollout, service mesh
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-351011 (URN); 10.1109/NOMS59830.2024.10575707 (DOI); 001270140300173 (); 2-s2.0-85198340187 (Scopus ID)
Conference
2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6-10, 2024
Note

Part of ISBN: 9798350327939

QC 20240725

Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2024-09-27. Bibliographically approved.
Samani, F. S., Hammar, K. & Stadler, R. (2023). Demonstrating a System for Dynamically Meeting Management Objectives on a Service Mesh. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023. Paper presented at 36th IEEE/IFIP Network Operations and Management Symposium, NOMS 2023, Miami, United States of America, May 8-12, 2023. Institute of Electrical and Electronics Engineers (IEEE)
Demonstrating a System for Dynamically Meeting Management Objectives on a Service Mesh
2023 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

We demonstrate a management system that lets a service provider achieve end-to-end management objectives under varying load for applications on a service mesh based on the Istio and Kubernetes platforms. The management objectives for the demonstration include end-to-end delay bounds on service requests, throughput objectives, and service differentiation. Our method for finding effective control policies includes a simulator and a control module. The simulator is instantiated with traces from a testbed, and the control module trains a reinforcement learning (RL) agent to efficiently learn effective control policies on the simulator. The learned policies are then transferred to the testbed to perform dynamic control actions based on monitored system metrics. We show that the learned policies dynamically meet management objectives on the testbed and can be changed on the fly.
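
One way a simulator can be instantiated from testbed traces is to fit a model that predicts service delay from the monitored state and the current resource allocation. The sketch below is our own guess at such a component; the feature set, model choice, and class name are assumptions, not the system's actual design.

    # Hypothetical sketch: instantiate a simulator from testbed traces by
    # fitting a regressor that predicts delay from load and allocation.
    from sklearn.ensemble import RandomForestRegressor

    class TraceSimulator:
        def __init__(self, traces):
            # traces: records with load, cpu_alloc, replicas, and measured delay
            X = [(t.load, t.cpu_alloc, t.replicas) for t in traces]
            y = [t.delay for t in traces]
            self.model = RandomForestRegressor().fit(X, y)

        def step(self, load, cpu_alloc, replicas):
            """Predict the delay the testbed would exhibit in this state."""
            return self.model.predict([(load, cpu_alloc, replicas)])[0]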

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
digital twin, Istio, Kubernetes, Performance management, reinforcement learning (RL), service mesh
National Category
Computer Sciences; Control Engineering
Identifiers
urn:nbn:se:kth:diva-334446 (URN); 10.1109/NOMS56928.2023.10154365 (DOI); 2-s2.0-85164731961 (Scopus ID)
Conference
36th IEEE/IFIP Network Operations and Management Symposium, NOMS 2023, Miami, United States of America, May 8-12, 2023
Note

Part of ISBN: 9781665477161

QC 20230821

Available from: 2023-08-21. Created: 2023-08-21. Last updated: 2024-06-10. Bibliographically approved.
Hammar, K. & Stadler, R. (2023). Digital Twins for Security Automation. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023. Paper presented at 36th IEEE/IFIP Network Operations and Management Symposium, NOMS 2023, Miami, United States of America, May 8-12, 2023. Institute of Electrical and Electronics Engineers (IEEE)
Digital Twins for Security Automation
2023 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

We present a novel emulation system for creating high-fidelity digital twins of IT infrastructures. The digital twins replicate key functionality of the corresponding infrastructures and make it possible to play out security scenarios in a safe environment. We show that this capability can be used to automate the process of finding effective security policies for a target infrastructure. In our approach, a digital twin of the target infrastructure is used to run security scenarios and collect data. The collected data is then used to instantiate simulations of Markov decision processes and learn effective policies through reinforcement learning, whose performance is validated in the digital twin. This closed-loop learning process executes iteratively and provides continuously evolving and improving security policies. We apply our approach to an intrusion response scenario. Our results show that the digital twin provides the necessary evaluative feedback to learn near-optimal intrusion response policies.
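
The closed loop described above can be summarized as the following skeleton. Every function is a hypothetical placeholder standing for a component of the system, passed in as an argument; none of these names are actual APIs.

    # Skeleton of the closed-loop learning process; all component functions
    # are hypothetical placeholders supplied by the caller.
    def closed_loop(create_twin, run_scenarios, instantiate_mdp,
                    learn_policy, validate, target_config, iterations=10):
        twin = create_twin(target_config)       # emulate the infrastructure
        policy = None
        for _ in range(iterations):
            data = run_scenarios(twin, policy)  # play out security scenarios
            mdp = instantiate_mdp(data)         # simulations of MDPs from data
            policy = learn_policy(mdp)          # reinforcement learning
            validate(twin, policy)              # evaluate in the digital twin
        return policy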

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
automation, bMDP, cybersecurity, Digital twin, network security, POMDP, reinforcement learning
National Category
Computer Systems; Information Systems
Identifiers
urn:nbn:se:kth:diva-334449 (URN); 10.1109/NOMS56928.2023.10154288 (DOI); 2-s2.0-85164728152 (Scopus ID)
Conference
36th IEEE/IFIP Network Operations and Management Symposium, NOMS 2023, Miami, United States of America, May 8-12, 2023
Note

Part of ISBN: 9781665477161

QC 20230821

Available from: 2023-08-21. Created: 2023-08-21. Last updated: 2023-08-21. Bibliographically approved.
Hammar, K. & Stadler, R. (2023). Scalable Learning of Intrusion Response Through Recursive Decomposition. In: Decision and Game Theory for Security - 14th International Conference, GameSec 2023, Proceedings. Paper presented at 14th International Conference on Decision and Game Theory for Security, GameSec 2023, Avignon, France, Oct 18-20, 2023 (pp. 172-192). Springer Nature
Scalable Learning of Intrusion Response Through Recursive Decomposition
2023 (English) In: Decision and Game Theory for Security - 14th International Conference, GameSec 2023, Proceedings, Springer Nature, 2023, p. 172-192. Conference paper, Published paper (Refereed)
Abstract [en]

We study automated intrusion response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed stochastic game. To solve the game, we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. Solutions proposed in previous work prove the feasibility of this approach for small infrastructures but do not scale to realistic scenarios due to the exponential growth in computational complexity with the infrastructure size. We address this problem by introducing a method that recursively decomposes the game into subgames with low computational complexity, which can be solved in parallel. Applying optimal stopping theory, we show that the best response strategies in these subgames exhibit threshold structures, which allows us to compute them efficiently. To solve the decomposed game, we introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns Nash equilibria through stochastic approximation. We evaluate the learned strategies in an emulation environment where real intrusions and response actions can be executed. The results show that the learned strategies approximate an equilibrium and that DFSP significantly outperforms a state-of-the-art algorithm for a realistic infrastructure configuration.
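
The parallel solution of subgames could be organized as below. The decomposition and solver functions are hypothetical placeholders; the sketch only shows how independent subgames admit embarrassingly parallel solving.

    # Hypothetical sketch: solve decomposed subgames in parallel and
    # compose the local strategies into a global one. decompose,
    # solve_subgame, and compose are placeholders, not real APIs.
    from multiprocessing import Pool

    def solve_decomposed(game, decompose, solve_subgame, compose, workers=8):
        subgames = decompose(game)            # recursive decomposition
        with Pool(workers) as pool:
            local_strategies = pool.map(solve_subgame, subgames)
        return compose(local_strategies)      # combine threshold best responses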

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Cybersecurity, game decomposition, game theory, intrusion response, network security, optimal control, reinforcement learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-342399 (URN); 10.1007/978-3-031-50670-3_9 (DOI); 001160781000009 (); 2-s2.0-85181981948 (Scopus ID)
Conference
14th International Conference on Decision and Game Theory for Security, GameSec 2023, Avignon, France, Oct 18-20, 2023
Note

QC 20240122

Available from: 2024-01-17. Created: 2024-01-17. Last updated: 2024-03-18. Bibliographically approved.