KTH Publications (kth.se)
Stadler, Rolf, Prof. ORCID iD: orcid.org/0000-0001-6039-8493
Publications (10 of 82)
Hammar, K., Li, T., Stadler, R. & Zhu, Q. (2025). Adaptive Security Response Strategies Through Conjectural Online Learning. IEEE Transactions on Information Forensics and Security, 20, 4055-4070
2025 (English) In: IEEE Transactions on Information Forensics and Security, ISSN 1556-6013, E-ISSN 1556-6021, Vol. 20, p. 4055-4070. Article in journal (Refereed), Published
Abstract [en]

We study the problem of learning adaptive security response strategies for an IT infrastructure. We formulate the interaction between an attacker and a defender as a partially observed, non-stationary game. We relax the standard assumption that the game model is correctly specified and consider that each player has a probabilistic conjecture about the model, which may be misspecified in the sense that the true model has probability 0. This formulation allows us to capture uncertainty and misconception about the infrastructure and the intents of the players. To learn effective game strategies online, we design Conjectural Online Learning (COL), a novel method where a player iteratively adapts its conjecture using Bayesian learning and updates its strategy through rollout. We prove that the conjectures converge to best fits, and we provide a bound on the performance improvement that rollout enables with a conjectured model. To characterize the steady state of the game, we propose a variant of the Berk-Nash equilibrium. We present COL through an intrusion response use case. Testbed evaluations show that COL produces effective security strategies that adapt to a changing environment. We also find that COL enables faster convergence than current reinforcement learning techniques.
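The Bayesian conjecture adaptation at the core of COL can be illustrated with a minimal sketch; the three candidate models, the prior, and the likelihood values below are invented for illustration and are not taken from the paper.

```python
import numpy as np

def update_conjecture(belief, likelihoods):
    """One Bayesian update of the conjecture: posterior over candidate models.

    belief      -- prior probabilities over K candidate models, shape (K,)
    likelihoods -- probability each model assigns to the new observation, shape (K,)
    """
    posterior = belief * likelihoods
    total = posterior.sum()
    if total == 0.0:  # every model assigns probability 0; keep the prior
        return belief
    return posterior / total

belief = np.array([0.5, 0.3, 0.2])       # prior over three conjectured models
likelihoods = np.array([0.1, 0.6, 0.3])  # P(observation | model k)
belief = update_conjecture(belief, likelihoods)
```

Repeating this update over an observation stream concentrates the belief on the models that best fit the feedback, which mirrors the convergence-to-best-fit result stated in the abstract.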

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Bayesian learning, Berk-Nash equilibrium, Cybersecurity, game theory, network security, rollout
National Category
Computer Sciences; Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-363203 (URN); 10.1109/TIFS.2025.3558600 (DOI); 001473091500004; 2-s2.0-105003490797 (Scopus ID)
Note

QC 20250609

Available from: 2025-05-07. Created: 2025-05-07. Last updated: 2025-06-09. Bibliographically approved.
Hammar, K. & Stadler, R. (2025). Intrusion Tolerance as a Two-Level Game. In: Decision and Game Theory for Security - 15th International Conference, GameSec 2024, Proceedings. Paper presented at 15th International Conference on Decision and Game Theory for Security, GameSec 2024, October 16-18, 2024, New York, United States of America (pp. 3-23). Springer Nature
2025 (English) In: Decision and Game Theory for Security - 15th International Conference, GameSec 2024, Proceedings, Springer Nature, 2025, p. 3-23. Conference paper, Published paper (Refereed)
Abstract [en]

We formulate intrusion tolerance for a system with service replicas as a two-level game: a local game models intrusion recovery and a global game models replication control. For both games, we prove the existence of equilibria and show that the best responses have a threshold structure, which enables efficient computation of strategies. State-of-the-art intrusion-tolerant systems can be understood as instantiations of our game with heuristic control strategies. Our analysis shows the conditions under which such heuristics can be significantly improved through game-theoretic reasoning. This reasoning allows us to derive the optimal control strategies and evaluate them against 10 types of network intrusions on a testbed. The testbed results demonstrate that our game-theoretic strategies can significantly improve service availability and reduce the operational cost of state-of-the-art intrusion-tolerant systems. In addition, our game strategies can ensure any chosen level of service availability and time-to-recovery, bridging the gap between theoretical and operational performance.
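The threshold structure of the best responses can be sketched as follows; the belief-state formulation and the threshold value 0.7 are illustrative assumptions, not values from the paper.

```python
def best_response(belief_compromised, threshold=0.7):
    """Threshold strategy: recover a replica iff the belief that it is
    compromised exceeds the threshold; otherwise continue normal operation."""
    return "recover" if belief_compromised >= threshold else "wait"

# The strategy is monotone in the belief: once the threshold is crossed,
# higher beliefs never switch the action back to 'wait'.
actions = [best_response(b / 10) for b in range(11)]
```

This structure is what enables efficient computation of strategies: instead of searching over all belief-to-action mappings, it suffices to search over scalar thresholds.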

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
bft, Cybersecurity, game theory, intrusion tolerance, network security, optimal control, reliability theory
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-355925 (URN); 10.1007/978-3-031-74835-6_1 (DOI); 001416979800001; 2-s2.0-85207655805 (Scopus ID)
Conference
15th International Conference on Decision and Game Theory for Security, GameSec 2024, October 16-18, 2024, New York, United States of America
Note

Part of ISBN 9783031748349

QC 20241106

Available from: 2024-11-06. Created: 2024-11-06. Last updated: 2025-03-17. Bibliographically approved.
Samani, F. S. & Stadler, R. (2024). A Framework for dynamically meeting performance objectives on a service mesh.
2024 (English) Manuscript (preprint) (Other academic)
Abstract [en]

We present a framework for achieving end-to-end management objectives for multiple services that concurrently execute on a service mesh. We apply reinforcement learning (RL) techniques to train an agent that periodically performs control actions to reallocate resources. We develop and evaluate the framework using a laboratory testbed where we run information and computing services on a service mesh, supported by the Istio and Kubernetes platforms. We investigate different management objectives that include end-to-end delay bounds on service requests, throughput objectives, cost-related objectives, and service differentiation. Our framework supports the design of a control agent for a given management objective. It is novel in that it advocates a top-down approach whereby the management objective is defined first and then mapped onto the available control actions. Several types of control actions can be executed simultaneously, which allows for efficient resource utilization. Second, the framework separates learning of the system model and the operating region from learning of the control policy. By first learning the system model and the operating region from testbed traces, we can instantiate a simulator and train the agent for different management objectives in parallel. Third, the use of a simulator shortens the training time by orders of magnitude compared with training the agent on the testbed. We evaluate the learned policies on the testbed and show the effectiveness of our approach in several scenarios. In one scenario, we design a controller that achieves the management objectives with 50% less system resources than Kubernetes HPA autoscaling.

National Category
Engineering and Technology; Computer Systems
Identifiers
urn:nbn:se:kth:diva-346583 (URN)
Note

QC 20240522

Available from: 2024-05-18. Created: 2024-05-18. Last updated: 2024-05-22. Bibliographically approved.
Samani, F. S. & Stadler, R. (2024). A Framework for Dynamically Meeting Performance Objectives on a Service Mesh. IEEE Transactions on Network and Service Management, 21(6), 5992-6007
2024 (English) In: IEEE Transactions on Network and Service Management, E-ISSN 1932-4537, Vol. 21, no 6, p. 5992-6007. Article in journal (Refereed), Published
Abstract [en]

We present a framework for achieving end-to-end management objectives for multiple services that concurrently execute on a service mesh. We apply reinforcement learning (RL) techniques to train an agent that periodically performs control actions to reallocate resources. We develop and evaluate the framework using a laboratory testbed where we run information and computing services on a service mesh, supported by the Istio and Kubernetes platforms. We investigate different management objectives that include end-to-end delay bounds on service requests, throughput objectives, cost-related objectives, and service differentiation. Our framework supports the design of a control agent for a given management objective. The management objective is defined first and then mapped onto available control actions. Several types of control actions can be executed simultaneously, which allows for efficient resource utilization. Second, the framework separates the learning of the system model and the operating region from the learning of the control policy. By first learning the system model and the operating region from testbed traces, we can instantiate a simulator and train the agent for different management objectives. Third, the use of a simulator shortens the training time by orders of magnitude compared with training the agent on the testbed. We evaluate the learned policies on the testbed and show the effectiveness of our approach in several scenarios. In one scenario, we design a controller that achieves the management objectives with 50% less system resources than Kubernetes HPA autoscaling.
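The separation between learning the system model and learning the control policy can be illustrated with a toy two-phase sketch: fit a delay model from traces, then search the learned model for the cheapest allocation that meets a delay objective. The synthetic traces, the inverse delay model, and the 30 ms objective are all invented for illustration; the paper learns its models from real testbed traces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Phase 0: synthetic "testbed traces" (CPU allocation -> end-to-end delay).
cpu = rng.uniform(1, 8, 200)                    # traced CPU allocations
delay = 100.0 / cpu + rng.normal(0, 0.5, 200)   # traced delays (ms)

# Phase 1: learn a simple system model, delay ~ a/cpu + b, from the traces.
X = np.column_stack([1.0 / cpu, np.ones_like(cpu)])
a, b = np.linalg.lstsq(X, delay, rcond=None)[0]

# Phase 2: use the learned model as a simulator and pick the cheapest
# allocation whose predicted delay meets the management objective (<= 30 ms).
candidates = np.arange(1.0, 8.1, 0.5)
feasible = [c for c in candidates if a / c + b <= 30.0]
best = min(feasible)   # minimum resources that satisfy the delay bound
```

Searching (or training an RL agent) against the learned model rather than the live testbed is what shortens training time by orders of magnitude in the framework above.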

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Microservice architectures, Measurement, Training, Reinforcement learning, Delays, Resource management, Throughput, Performance management, adaptive resource allocation, microservice, operating region
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-359479 (URN); 10.1109/TNSM.2024.3434328 (DOI); 001381366600033; 2-s2.0-85199569083 (Scopus ID)
Note

Not a duplicate of DiVA 1858751

QC 20250206

Available from: 2025-02-06. Created: 2025-02-06. Last updated: 2025-02-06. Bibliographically approved.
Samani, F. S., Larsson, H., Damberg, S., Johnsson, A. & Stadler, R. (2024). Comparing Transfer Learning and Rollout for Policy Adaptation in a Changing Network Environment. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024. Paper presented at 2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6 2024 - May 10 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Dynamic resource allocation for network services is pivotal for achieving end-to-end management objectives. Previous research has demonstrated that Reinforcement Learning (RL) is a promising approach to resource allocation in networks, making it possible to obtain near-optimal control policies for non-trivial system configurations. Current RL approaches, however, have the drawback that a change in the system or the management objective necessitates expensive retraining of the RL agent. To tackle this challenge, practical solutions including offline retraining, transfer learning, and model-based rollout have been proposed. In this work, we study these methods and present comparative results that shed light on their respective performance and benefits. Our study finds that rollout achieves faster adaptation than transfer learning, yet its effectiveness depends strongly on the accuracy of the system model.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Istio, Kubernetes, Performance management, policy adaptation, reinforcement learning, rollout, service mesh
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-351010 (URN); 10.1109/NOMS59830.2024.10575398 (DOI); 001270140300103; 2-s2.0-85198375028 (Scopus ID)
Conference
2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6 2024 - May 10 2024
Note

Part of ISBN 9798350327939

QC 20240725

Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2024-09-27. Bibliographically approved.
Li, T., Hammar, K., Stadler, R. & Zhu, Q. (2024). Conjectural Online Learning with First-order Beliefs in Asymmetric Information Stochastic Games. In: 2024 IEEE 63rd Conference on Decision and Control, CDC 2024. Paper presented at 63rd IEEE Conference on Decision and Control, CDC 2024, Milan, Italy, December 16-19, 2024 (pp. 6780-6785). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English) In: 2024 IEEE 63rd Conference on Decision and Control, CDC 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 6780-6785. Conference paper, Published paper (Refereed)
Abstract [en]

Asymmetric information stochastic games (AISGs) arise in many complex socio-technical systems, such as cyber-physical systems and IT infrastructures. Existing computational methods for AISGs are primarily offline and cannot adapt to equilibrium deviations. Further, current methods are limited to particular information structures to avoid belief hierarchies. Considering these limitations, we propose conjectural online learning (COL), an online learning method under generic information structures in AISGs. COL uses a forecaster-actor-critic (FAC) architecture, where subjective forecasts are used to conjecture the opponents' strategies within a lookahead horizon, and Bayesian learning is used to calibrate the conjectures. To adapt strategies to nonstationary environments based on information feedback, COL uses online rollout with cost function approximation (actor-critic). We prove that the conjectures produced by COL are asymptotically consistent with the information feedback in the sense of a relaxed Bayesian consistency. We also prove that the empirical strategy profile induced by COL converges to the Berk-Nash equilibrium, a solution concept characterizing rationality under subjectivity. Experimental results from an intrusion response use case demonstrate COL's faster convergence over state-of-the-art reinforcement learning methods against nonstationary attacks.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-361743 (URN); 10.1109/CDC56724.2024.10886479 (DOI); 001445827205093; 2-s2.0-86000618322 (Scopus ID)
Conference
63rd IEEE Conference on Decision and Control, CDC 2024, Milan, Italy, December 16-19, 2024
Note

Part of ISBN 9798350316339

QC 20250328

Available from: 2025-03-27. Created: 2025-03-27. Last updated: 2025-12-05. Bibliographically approved.
Hammar, K. & Stadler, R. (2024). Intrusion tolerance for networked systems through two-level feedback control. In: Proceedings - 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024. Paper presented at 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024, June 24-27 2024, Brisbane, Australia (pp. 338-352). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English) In: Proceedings - 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 338-352. Conference paper, Published paper (Refereed)
Abstract [en]

We formulate intrusion tolerance for a system with service replicas as a two-level optimal control problem. On the local level, node controllers perform intrusion recovery; on the global level, a system controller manages the replication factor. The local and global control problems can be formulated as classical problems in operations research, namely the machine replacement problem and the inventory replenishment problem. Based on this formulation, we design TOLERANCE, a novel control architecture for intrusion-tolerant systems. We prove that the optimal control strategies on both levels have threshold structure and design efficient algorithms for computing them. We implement and evaluate TOLERANCE in an emulation environment where we run 10 types of network intrusions. The results show that TOLERANCE can improve service availability and reduce operational cost compared with state-of-the-art intrusion-tolerant systems.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
BFT, Byzantine fault tolerance, CMDP, intrusion recovery, Intrusion tolerance, MDP, optimal control, POMDP
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-353946 (URN); 10.1109/DSN58291.2024.00042 (DOI); 001313667600025; 2-s2.0-85203812073 (Scopus ID)
Conference
54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2024, June 24-27 2024, Brisbane, Australia
Note

Part of ISBN: 979-8-3503-4105-8

QC 20241111

Available from: 2024-09-25. Created: 2024-09-25. Last updated: 2024-11-11. Bibliographically approved.
Wang, X. & Stadler, R. (2024). IT Intrusion Detection Using Statistical Learning and Testbed Measurements. In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024. Paper presented at 2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6 2024 - May 10 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English) In: Proceedings of IEEE/IFIP Network Operations and Management Symposium 2024, NOMS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

We study automated intrusion detection in an IT infrastructure, specifically the problem of identifying the start of an attack, the type of attack, and the sequence of actions an attacker takes, based on continuous measurements from the infrastructure. We apply statistical learning methods, including Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), and Random Forest Classifier (RFC) to map sequences of observations to sequences of predicted attack actions. In contrast to most related research, we have abundant data to train the models and evaluate their predictive power. The data comes from traces we generate on an in-house testbed where we run attacks against an emulated IT infrastructure. Central to our work is a machine-learning pipeline that maps measurements from a high-dimensional observation space to a space of low dimensionality or to a small set of observation symbols. Investigating intrusions in offline as well as online scenarios, we find that both HMM and LSTM can be effective in predicting attack start time, attack type, and attack actions. If sufficient training data is available, LSTM achieves higher prediction accuracy than HMM. HMM, on the other hand, requires fewer computational resources and less training data for effective prediction. Also, we find that the methods we study benefit from data produced by traditional intrusion detection systems like SNORT.
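The HMM part of the pipeline can be sketched as follows: measurements are first quantized into a small set of observation symbols, and Viterbi decoding then recovers the most likely hidden attack states. The two-state model, the symbols, and all probabilities below are illustrative assumptions; the paper's models are trained on testbed traces.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state sequence for a discrete-observation HMM."""
    n, k = len(obs), len(pi)
    logp = np.full((n, k), -np.inf)       # log-probability of best path to (t, j)
    back = np.zeros((n, k), dtype=int)    # backpointers for path recovery
    logp[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, n):
        for j in range(k):
            scores = logp[t - 1] + np.log(A[:, j])
            back[t, j] = np.argmax(scores)
            logp[t, j] = scores[back[t, j]] + np.log(B[j, obs[t]])
    path = [int(np.argmax(logp[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hidden states: 0 = normal, 1 = attack (absorbing).
# Observation symbols: 0 = low alert rate, 1 = high alert rate (e.g. SNORT counts).
pi = np.array([0.99, 0.01])
A = np.array([[0.95, 0.05],
              [1e-9, 1.0 - 1e-9]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
obs = [0, 0, 0, 1, 1, 1]
states = viterbi(obs, pi, A, B)  # -> [0, 0, 0, 1, 1, 1]: attack onset at t = 3
```

A real deployment would learn pi, A, and B from traces (e.g. with Baum-Welch) rather than hard-code them.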

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
automated security, forensics, Hidden Markov Model, intrusion detection, Long Short-Term Memory, SNORT
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-351006 (URN); 10.1109/NOMS59830.2024.10575087 (DOI); 001270140300036; 2-s2.0-85198353664 (Scopus ID)
Conference
2024 IEEE/IFIP Network Operations and Management Symposium, NOMS 2024, Seoul, Korea, May 6 2024 - May 10 2024
Note

QC 20241007

Part of ISBN 9798350327939

Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2024-10-07. Bibliographically approved.
Hammar, K. & Stadler, R. (2024). Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. IEEE Transactions on Network and Service Management, 21(1), 1158-1177
2024 (English) In: IEEE Transactions on Network and Service Management, E-ISSN 1932-4537, Vol. 21, no 1, p. 1158-1177. Article in journal (Refereed), Published
Abstract [en]

We study automated intrusion response and formulate the interaction between an attacker and a defender as an optimal stopping game where attack and defense strategies evolve through reinforcement learning and self-play. The game-theoretic modeling enables us to find defender strategies that are effective against a dynamic attacker, i.e., an attacker that adapts its strategy in response to the defender strategy. Further, the optimal stopping formulation allows us to prove that best response strategies have threshold properties. To obtain near-optimal defender strategies, we develop Threshold Fictitious Self-Play (T-FP), a fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that T-FP outperforms a state-of-the-art algorithm for our use case. The experimental part of this investigation includes two systems: a simulation system where defender strategies are incrementally learned and an emulation system where statistics are collected that drive simulation runs and where learned strategies are evaluated. We argue that this approach can produce effective defender strategies for a practical IT infrastructure.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Games, Security, Emulation, Reinforcement learning, Observability, Logic gates, History, Cybersecurity, network security, automated security, intrusion response, optimal stopping, Dynkin games, game theory, Markov decision process, MDP, POMDP
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-345922 (URN); 10.1109/TNSM.2023.3293413 (DOI); 001167106200022; 2-s2.0-85164381105 (Scopus ID)
Note

QC 20240502

Available from: 2024-05-02. Created: 2024-05-02. Last updated: 2024-07-04. Bibliographically approved.
Shahabsamani, F., Hammar, K. & Stadler, R. (2024). Online Policy Adaptation for Networked Systems using Rollout. Paper presented at IEEE/IFIP Network Operations and Management Symposium, 6–10 May 2024, Seoul, South Korea.
2024 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Dynamic resource allocation in networked systems is needed to continuously achieve end-to-end management objectives. Recent research has shown that reinforcement learning can achieve near-optimal resource allocation policies for realistic system configurations. However, most current solutions require expensive retraining when changes in the system occur. We address this problem and introduce an efficient method to adapt a given base policy to system changes, e.g., to a change in the service offering. In our approach, we adapt a base control policy using a rollout mechanism, which transforms the base policy into an improved rollout policy. We perform extensive evaluations on a testbed where we run applications on a service mesh based on the Istio and Kubernetes platforms. The experiments provide insights into the performance of different rollout algorithms. We find that our approach produces policies that are equally effective as those obtained by offline retraining. On our testbed, effective policy adaptation takes seconds when using rollout, compared to minutes or hours when using retraining. Our work demonstrates that rollout, which has been applied successfully in other domains, is an effective approach for policy adaptation in networked systems.
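One-step rollout in the sense described above can be sketched as follows; the toy queue model, the cost function, and the constant base policy are invented for illustration.

```python
def rollout_action(state, actions, step, base_policy, cost, horizon=5):
    """One-step lookahead: try each first action, then follow the base policy."""
    def simulate(s, first_action):
        total, a = 0.0, first_action
        for _ in range(horizon):
            total += cost(s, a)   # accumulate cost along the simulated trajectory
            s = step(s, a)        # system model predicts the next state
            a = base_policy(s)    # after the first step, follow the base policy
        return total
    return min(actions, key=lambda a: simulate(state, a))

# Toy system: a queue with 2 arrivals per step; the action is the service
# capacity allocated (e.g. number of replicas).
step = lambda s, a: max(0, s + 2 - a)
cost = lambda s, a: s + 0.5 * a          # delay cost plus resource cost
base_policy = lambda s: 2                # base policy keeps the queue constant

action = rollout_action(10, [1, 2, 3], step, base_policy, cost)  # -> 3
```

The rollout policy improves on the base policy with a single simulated lookahead per decision, which is why adaptation can take seconds instead of the minutes or hours required for full retraining.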

National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-346582 (URN)
Conference
IEEE/IFIP Network Operations and Management Symposium 6–10 May 2024, Seoul, South Korea
Note

QC 20240522

Available from: 2024-05-18. Created: 2024-05-18. Last updated: 2024-06-10. Bibliographically approved.