KTH Publications (DiVA)
Learning Near-Optimal Intrusion Responses Against Dynamic Attackers
Hammar, Kim: KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Network and Systems Engineering; Centre for Cyber Defence and Information Security (CDIS). ORCID iD: 0000-0003-1773-8354
Stadler, Rolf: KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Network and Systems Engineering; Centre for Cyber Defence and Information Security (CDIS). ORCID iD: 0000-0001-6039-8493
2024 (English). In: IEEE Transactions on Network and Service Management, E-ISSN 1932-4537, vol. 21, no. 1, p. 1158-1177. Article in journal (Refereed). Published.
Abstract [en]

We study automated intrusion response and formulate the interaction between an attacker and a defender as an optimal stopping game where attack and defense strategies evolve through reinforcement learning and self-play. The game-theoretic modeling enables us to find defender strategies that are effective against a dynamic attacker, i.e., an attacker that adapts its strategy in response to the defender strategy. Further, the optimal stopping formulation allows us to prove that best response strategies have threshold properties. To obtain near-optimal defender strategies, we develop Threshold Fictitious Self-Play (T-FP), a fictitious self-play algorithm that learns Nash equilibria through stochastic approximation. We show that T-FP outperforms a state-of-the-art algorithm for our use case. The experimental part of this investigation includes two systems: a simulation system where defender strategies are incrementally learned and an emulation system where statistics are collected that drive simulation runs and where learned strategies are evaluated. We argue that this approach can produce effective defender strategies for a practical IT infrastructure.
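The two ideas at the heart of the abstract, fictitious self-play converging to a Nash equilibrium and best responses with threshold structure, can be illustrated on a toy example. The sketch below is not the paper's T-FP algorithm (which operates on a partially observed stopping game learned from emulation statistics); it runs classic fictitious play on a 2x2 zero-sum matrix game (matching pennies), where the empirical action frequencies are known to converge to the mixed Nash equilibrium (1/2, 1/2). The `threshold_stop` helper and its parameter `alpha` are hypothetical, added only to show what a threshold stopping rule looks like.

```python
import numpy as np

# Row player's payoff matrix for matching pennies (zero-sum):
# a toy stand-in for the attacker-defender stopping game.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def fictitious_play(A, iters=5000):
    """Classic fictitious play: each player best-responds to the
    opponent's empirical action frequencies."""
    m, n = A.shape
    row_counts = np.zeros(m)
    col_counts = np.zeros(n)
    row_counts[0] = 1.0  # arbitrary initial play
    col_counts[0] = 1.0
    for _ in range(iters):
        col_mix = col_counts / col_counts.sum()
        row_br = np.argmax(A @ col_mix)   # row player maximizes
        row_mix = row_counts / row_counts.sum()
        col_br = np.argmin(row_mix @ A)   # column player minimizes
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

row_strategy, col_strategy = fictitious_play(A)
# Empirical frequencies approach the Nash equilibrium (0.5, 0.5).

def threshold_stop(belief, alpha=0.5):
    """Illustrative threshold rule (alpha is a hypothetical parameter):
    stop, e.g. block the attacker, once the belief that an intrusion
    is ongoing reaches the threshold."""
    return belief >= alpha
```

In the paper's setting the best responses of the stopping game are exactly such threshold rules on the defender's belief state, which is what lets T-FP search over thresholds instead of arbitrary strategies.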

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. Vol. 21, no. 1, p. 1158-1177
Keywords [en]
Games, Security, Emulation, Reinforcement learning, Observability, Logic gates, History, Cybersecurity, network security, automated security, intrusion response, optimal stopping, Dynkin games, game theory, Markov decision process, MDP, POMDP
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-345922
DOI: 10.1109/TNSM.2023.3293413
ISI: 001167106200022
Scopus ID: 2-s2.0-85164381105
OAI: oai:DiVA.org:kth-345922
DiVA id: diva2:1855634
Note

QC 20240502

Available from: 2024-05-02. Created: 2024-05-02. Last updated: 2024-07-04. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Hammar, Kim; Stadler, Rolf
