Coordinated control of Flexible AC Transmission Systems (FACTS) setpoints can significantly enhance power flow and voltage control. However, optimizing the setpoints of multiple FACTS devices in real-world systems remains uncommon, partly due to challenges in model-based control. Data-driven approaches, such as reinforcement learning (RL), offer a promising alternative for coordinated control. In this work, we address a setting where a useful real-time network model is unavailable. Recognizing the increasing deployment of Phasor Measurement Units (PMUs) for advanced monitoring and control, we assume access to few but reliable measurements and a constraint violation signal. Under these assumptions, we demonstrate in several scenarios on the IEEE 14-bus and IEEE 57-bus systems that RL-based optimization of FACTS setpoints can substantially reduce voltage deviations compared to a fixed-setpoint baseline. We show that a complete, albeit simple, constraint violation signal is necessary to improve robustness and prevent unobserved constraint violations. As an alternative to relying on such a signal, we propose using Dynamic Mode Decomposition to determine new PMU placements, thereby reducing the risk of unobserved constraint violations. Finally, to assess the gap to an optimal policy, we benchmark the RL-based agent against a model-based optimal controller with perfect information.