KTH Publications (kth.se)
Offline to Online Reinforcement Learning for Optimizing FACTS Setpoints
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. Hitachi Energy, 72212 Västerås, Sweden. ORCID iD: 0000-0002-3138-9915
Hitachi Energy, 5405 Baden-Dättwil, Switzerland. ORCID iD: 0000-0001-5423-2550
Hitachi Energy, 72212 Västerås, Sweden. ORCID iD: 0009-0000-1964-2458
KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electric Power and Energy Systems. ORCID iD: 0000-0003-3014-5609
Show others and affiliations
2025 (English). In: Sustainable Energy, Grids and Networks, E-ISSN 2352-4677, Vol. 43, article id 101826. Article in journal (Refereed). Published.
Abstract [en]

With the growing electrification and integration of renewables, network operators face unprecedented challenges. Coordinated control of Flexible AC Transmission Systems (FACTS) setpoints using real-time optimization techniques has been proposed to substantially improve voltage and power flow control. However, optimizing the setpoints of several FACTS devices is rarely done in practice, in part because of the challenges of model-based methods. As an alternative, data-driven control methods based on reinforcement learning (RL) have shown great promise. However, RL poses challenges of its own, including data requirements and safety during learning. Motivated by the increasing collection of data, we study an RL-based optimization of FACTS setpoints and how datasets can be leveraged for pre-training to improve safety. We demonstrate on the IEEE 14-bus and IEEE 57-bus systems that an offline-to-online RL algorithm can significantly reduce voltage deviations and constraint violations. The performance is compared against an RL agent learning from scratch and the original control policy that generated the dataset. Moreover, our analysis shows that dataset coverage and the number of pre-training updates affect performance considerably. Finally, to identify the gap to an optimal policy, the proposed approach is benchmarked against an optimal controller with perfect information.
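The offline-to-online scheme the abstract describes — pre-train a value function on a logged dataset, then continue updating it while acting — can be illustrated with a minimal tabular sketch. Everything below (the 1-D "voltage" environment, the reward, the update rule) is an illustrative stand-in chosen for brevity, not the algorithm or test systems used in the paper:

```python
import random

# Toy 1-D environment: states 0..4 are discretized voltage levels and
# state 2 is the target band; actions nudge the setpoint down/stay/up.
TARGET = 2
ACTIONS = (-1, 0, 1)

def step(state, action):
    nxt = min(4, max(0, state + action))
    return nxt, -abs(nxt - TARGET)  # reward penalizes voltage deviation

def td_update(Q, s, a, r, s2, alpha=0.5, gamma=0.9):
    best_next = max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def greedy(Q, s):
    return max(ACTIONS, key=lambda a: Q[(s, a)])

rng = random.Random(0)
Q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}

# Offline phase: pre-train Q on a fixed logged dataset generated
# earlier by a (here: uniformly random) behavior policy.
dataset = []
for _ in range(500):
    s, a = rng.randrange(5), rng.choice(ACTIONS)
    s2, r = step(s, a)
    dataset.append((s, a, r, s2))
for epoch in range(20):
    for s, a, r, s2 in dataset:
        td_update(Q, s, a, r, s2)

# Online phase: keep learning while acting epsilon-greedily,
# starting from the pre-trained Q rather than from scratch.
s = 0
for t in range(200):
    a = rng.choice(ACTIONS) if rng.random() < 0.1 else greedy(Q, s)
    s2, r = step(s, a)
    td_update(Q, s, a, r, s2)
    s = s2
```

After the offline phase the agent already starts online interaction with a near-greedy-optimal policy, which is the safety argument for pre-training: fewer exploratory constraint violations than an agent initialized from scratch.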

Place, publisher, year, edition, pages
Elsevier BV, 2025. Vol. 43, article id 101826
Keywords [en]
Decision support systems, Flexible AC Transmission Systems (FACTS), power system control, reinforcement learning
National Category
Computer Sciences
Research subject
Computer Science; Electrical Engineering
Identifiers
URN: urn:nbn:se:kth:diva-365883
DOI: 10.1016/j.segan.2025.101826
ISI: 001550489000016
Scopus ID: 2-s2.0-105012372136
OAI: oai:DiVA.org:kth-365883
DiVA, id: diva2:1979922
Conference
Bulk Power System Dynamics and Control - XII, June 2025, Sorrento, Italy
Funder
Swedish Foundation for Strategic Research, ID19-0058
Note

QC 20250916

Available from: 2025-07-01 Created: 2025-07-01 Last updated: 2025-09-16. Bibliographically approved.
In thesis
1. Coordinated Control of FACTS Setpoints Using Reinforcement Learning
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

With the increasing electrification and integration of renewables, power system operators face severe control challenges, including voltage stability, faster dynamics, and congestion management. Potential solutions encompass more advanced control systems and accurate measurements. One encouraging mitigation strategy is coordinated control of Flexible AC Transmission Systems (FACTS) setpoints to substantially improve voltage and power flow control. However, due to model-based optimization challenges related to, e.g., imperfect models and uncertainty, fixed setpoints are often used in practice. Promising alternatives are data-driven methods based on, for example, reinforcement learning (RL). Motivated by these challenges, the accumulation of high-quality data, and the advancements in RL, this thesis explores RL-based coordinated control of FACTS setpoints. With a focus on safety, four problem settings are investigated on the IEEE 14-bus and IEEE 57-bus systems, addressing limited pre-training, model errors, few measurements, and datasets for pre-training. First, we propose WMAP, a model-based RL algorithm that learns and uses a compressed dynamics model to optimize voltage and current setpoints. WMAP includes a mechanism to mitigate poor performance on out-of-distribution data, and it is shown to outperform model-free RL and an infrequently updated expert policy. Second, when power system model errors are present, safe RL is demonstrated to outperform classical model-based optimization in terms of constraint satisfaction. Third, RL is shown to exceed the performance of fixed setpoints using a few measurements, provided it has a complete, albeit simple, constraint signal. Finally, RL that leverages datasets for offline pre-training is demonstrated to outperform the original policy that generated the dataset and an RL agent trained from scratch.
Overall, these four works contribute to advancing the field towards a more adaptable and sustainable power system.

Abstract [sv]

With the increasing electrification and integration of renewable energy, grid operators face major control challenges. These challenges include voltage stability, faster dynamics, and congestion management. Potential solutions include more advanced control systems and accurate measurements. One promising strategy for partially addressing these problems is coordinated control of setpoints for Flexible AC Transmission Systems (FACTS), which can substantially improve voltage and power flow control. In practice, however, fixed setpoints are often used, owing to optimization difficulties related to, for example, uncertainty and model errors. A promising alternative is data-driven methods based on, for example, reinforcement learning (RL). Against the background of these challenges, the availability of high-quality data, and the advances in RL, this thesis investigates RL-based coordinated control of FACTS setpoints. With a focus on safety, four problem settings are investigated on the IEEE 14-bus and 57-bus systems, addressing limited pre-training, model errors, few measurements, and the use of datasets for pre-training. First, we propose WMAP, a model-based RL algorithm that learns and uses a compressed dynamics model to optimize voltage and current setpoints. WMAP includes a mechanism to mitigate poor performance on out-of-distribution data, and it is shown to outperform model-free RL and an infrequently updated expert policy. Second, when model errors are present in the power system, we show that safe RL achieves better constraint satisfaction than classical model-based optimization. Third, we show that RL can outperform fixed setpoints using only a few measurements, provided it has access to a complete, albeit simple, constraint signal.
Finally, we show that RL that leverages datasets for offline pre-training can outperform both the original policy that generated the dataset and an RL agent trained from scratch. Overall, these four works contribute to advancing the field towards a more adaptable and sustainable power system.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2025. p. xviii, 103
Series
TRITA-EECS-AVL ; 2025:80
Keywords
Decision support systems, Flexible AC Transmission Systems (FACTS), power system control, reinforcement learning
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-369519 (URN)
978-91-8106-387-5 (ISBN)
Public defence
2025-10-08, https://kth-se.zoom.us/j/65901664759, F3 (Flodis), Lindstedtsvägen 26 & 28, Stockholm, 13:00 (English)
Opponent
Supervisors
Funder
Swedish Foundation for Strategic Research, ID19-0058
Note

QC 20250908

Available from: 2025-09-08 Created: 2025-09-08 Last updated: 2025-10-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Tarle, Magnus; Nordström, Lars; Björkman, Mårten

Search in DiVA

By author/editor
Tarle, Magnus; Larsson, Mats; Ingeström, Gunnar; Nordström, Lars; Björkman, Mårten
By organisation
Robotics, Perception and Learning, RPL; Electric Power and Energy Systems