Balancing detectability and performance of attacks on the control channel of Markov Decision Processes
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). (Statistical Learning for Control) ORCID iD: 0000-0001-9083-5260
2022 (English) Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
2022.
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-312061
OAI: oai:DiVA.org:kth-312061
DiVA, id: diva2:1657247
Conference
American Control Conference
Note

QCR 20220510

Submitted American Control Conference June 8-10, 2022 in Atlanta, Georgia, USA 

Available from: 2022-05-10 Created: 2022-05-10 Last updated: 2022-06-25 Bibliographically approved
In thesis
1. Analysis of Attacks on Controlled Stochastic Systems
2022 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

In this thesis, we investigate attack vectors against Markov decision processes and dynamical systems. This work is motivated by the recent interest in the research community in making Machine Learning models safer against malicious attacks. We focus on different attack vectors: (I) attacks that alter the input/output signal of a Markov decision process; (II) eavesdropping attacks whose aim is to detect a change in a dynamical system; (III) poisoning attacks against data-driven control methods.

(I) For attacks on Markov decision processes we focus on two types of attacks: (1) attacks that alter the observations of the victim, and (2) attacks that alter the control signal of the victim. Regarding (1), we investigate the problem of devising optimal attacks that minimize the collected reward of the victim. We show that when the policy and the system are known to the attacker, designing optimal attacks amounts to solving a Markov decision process. We also show that, for the victim, the system uncertainties induced by the attack can be modeled using a Partially Observable Markov decision process (POMDP) framework. We demonstrate that using Reinforcement Learning methods tailored to POMDPs leads to more resilient policies. Regarding (2), we instead investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes. Previous work constrained the amplitude of the adversarial perturbation, in the hope that this constraint would make the attack imperceptible. However, such constraints do not grant any level of undetectability and do not take into account the dynamic nature of the underlying Markov process. To design an optimal stealthy attack, we investigate a new attack formulation, based on information-theoretical quantities, that considers the objective of minimizing the detectability of the attack as well as the performance of the controlled process.

(II) In the second part of this thesis we analyse the problem where an eavesdropper tries to detect a change in a Markov decision process. These processes may be affected by changes that need to remain private. We study the problem using theoretical tools from optimal detection theory to motivate a definition of online privacy based on the average amount of information per observation of the underlying stochastic system. We provide ways to derive privacy upper bounds and compute policies that attain a higher privacy level, concluding with examples and numerical simulations.

(III) Lastly, we investigate poisoning attacks against data-driven control methods. Specifically, we analyse how a malicious adversary can slightly poison the data so as to minimize the performance of a controller trained using this data. We show that identifying the most impactful attack boils down to solving a bi-level non-convex optimization problem, and provide theoretical insights on the attack. We present a generic algorithm that finds a local optimum of this problem and illustrate our analysis for various techniques. Numerical experiments reveal that minimal but well-crafted changes in the data set are sufficient to deteriorate the performance of data-driven control methods significantly, and even make the closed-loop system unstable.
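The trade-off described in part (I) between attack performance and detectability can be illustrated with a toy example. The sketch below is not the formulation used in the thesis: it assumes a small tabular MDP, a known deterministic victim policy, and a per-step Kullback-Leibler penalty (weighted by a hypothetical parameter `lam`) as a stand-in for the information-theoretical detectability term. Under those assumptions the attacker's problem is itself an MDP, solved here by plain value iteration. The names `kl`, `solve_attack`, `P`, `r`, `policy` and `lam` are all illustrative.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions, assuming q > 0 wherever p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def solve_attack(P, r, policy, lam, gamma=0.9, iters=500):
    """Value iteration on the attacker's MDP (illustrative assumption).

    P[s, a]   : next-state distribution when action a is applied in state s
    r[s, a]   : victim's reward
    policy[s] : victim's known deterministic action
    lam       : weight of the detectability penalty
    Returns the attacker's greedy action per state and the KL cost table.
    """
    n_states, n_actions, _ = P.shape
    # Detectability proxy: how far the attacked transition law is from the
    # one the victim expects under its own action.
    cost = np.array([[kl(P[s, a], P[s, policy[s]]) for a in range(n_actions)]
                     for s in range(n_states)])
    # The attacker gains what the victim loses, minus the penalty.
    r_att = -r - lam * cost
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = r_att + gamma * (P @ V)  # Q[s, a], shape (n_states, n_actions)
        V = Q.max(axis=1)
    return Q.argmax(axis=1), cost

# Toy instance: action 0 steers towards state 0 (reward 1 for the victim),
# action 1 steers towards state 1 (reward 0). The victim always plays 0.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.9, 0.1], [0.1, 0.9]]])
r = np.array([[1.0, 1.0],
              [0.0, 0.0]])
policy = np.array([0, 0])

aggressive, cost = solve_attack(P, r, policy, lam=0.0)
stealthy, _ = solve_attack(P, r, policy, lam=10.0)
print("attack actions with lam=0: ", aggressive)
print("attack actions with lam=10:", stealthy)
```

In this toy instance the unpenalised attacker overrides the victim's action in both states, while a heavily penalised attacker leaves the victim's actions untouched. Intermediate values of `lam` move between these two extremes.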

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2022. p. 103
Series
TRITA-EECS-AVL ; 2022:34
Keywords
reinforcement learning, markov decision process, attack, detection, data poisoning, online learning
National Category
Control Engineering
Research subject
Computer Science; Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-312089 (URN)
978-91-8040-231-6 (ISBN)
Presentation
2022-05-31, Nyquist Room, Malvinas Väg 10, Stockholm, 16:00 (English)
Note

QC 20220510

Topic: Alessio Russo - Licentiate
Time: May 31, 2022 04:00 PM Madrid

 Zoom Meeting link https://kth-se.zoom.us/j/69452765598

Available from: 2022-05-10 Created: 2022-05-10 Last updated: 2022-09-20 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Russo, Alessio
