Reducing the Learning Time of Reinforcement Learning for the Supervisory Control of Discrete Event Systems
School of Electro-Mechanical Engineering, Xidian University, Xi’an, China. ORCID iD: 0000-0002-5988-0335
KTH, School of Industrial Engineering and Management (ITM), Engineering Design, Mechatronics and Embedded Control Systems. ORCID iD: 0000-0003-4535-3849
KTH, School of Industrial Engineering and Management (ITM), Engineering Design, Mechatronics and Embedded Control Systems. ORCID iD: 0000-0001-5703-5923
Department of Industrial Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia. ORCID iD: 0000-0003-3559-6249
2023 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 11, p. 59840-59853. Article in journal (Refereed). Published.
Abstract [en]

Reinforcement learning (RL) can obtain a supervisory controller for discrete-event systems modeled by finite automata and temporal logic. Published methods often suffer from two limitations. First, a large amount of training data is required to learn the RL controller. Second, the RL algorithms do not consider uncontrollable events, which are essential to supervisory control theory (SCT). To address these limitations, we first apply SCT to find supervisors for the specifications modeled by automata. These supervisors remove illegal training data that violate the specifications and hence reduce the exploration space of the RL algorithm. For the remaining specifications, modeled by temporal logic, the RL algorithm searches for the optimal control decision within the confined exploration space. Uncontrollable events are treated by the RL algorithm as uncertainties in the plant model. The proposed method obtains a nonblocking supervisor for all specifications with less learning time than the published methods.
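
To make the approach concrete, the sketch below illustrates the core idea from the abstract. Everything here is a hypothetical toy, not the authors' code: the plant automaton, the event names, and the hard-coded SAFE set (standing in for a supervisor assumed to be precomputed offline by standard SCT synthesis) are all illustrative assumptions. The supervisor mask confines tabular Q-learning to legal controllable events, while the uncontrollable event 'u' appears to the learner only as stochastic plant behavior.

```python
import random
from collections import defaultdict

# Toy plant automaton: state -> {event: next_state}.
PLANT = {
    0: {'a': 1, 'b': 2},
    1: {'c': 4, 'u': 3},   # 'u' (uncontrollable) can reach the illegal state
    2: {'c': 4, 'u': 0},   # 'u' here is only a harmless reset
    3: {},                 # illegal state (violates the automaton spec)
    4: {},                 # marked goal state
}
UNCONTROLLABLE = {'u'}
GOAL, ILLEGAL = 4, 3

# Supervisor result, assumed precomputed by SCT synthesis: state 1 is unsafe
# because the undisablable event 'u' can drag the plant to the illegal state,
# so the controllable event 'a' is effectively disabled at state 0.
SAFE = {0, 2, 4}

def enabled_actions(s):
    """Controllable events the supervisor allows at state s."""
    return [e for e, s2 in PLANT[s].items()
            if e not in UNCONTROLLABLE and s2 in SAFE]

def step(s, e):
    """Fire a controllable event, then let an enabled uncontrollable event
    fire with probability 0.5 (plant uncertainty from the RL viewpoint)."""
    s = PLANT[s][e]
    unc = [u for u in PLANT[s] if u in UNCONTROLLABLE]
    if unc and random.random() < 0.5:
        s = PLANT[s][random.choice(unc)]
    r = 1.0 if s == GOAL else -0.05       # small step cost, goal bonus
    return s, r, s in (GOAL, ILLEGAL)

# Plain tabular Q-learning over the supervisor-confined exploration space.
Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(500):
    s, done = 0, False
    while not done:
        acts = enabled_actions(s)
        if not acts:
            break
        a = (random.choice(acts) if random.random() < eps
             else max(acts, key=lambda e: Q[(s, e)]))
        s2, r, done = step(s, a)
        best = max((Q[(s2, e)] for e in enabled_actions(s2)), default=0.0)
        Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
        s = s2

print(sorted(Q.items()))  # learned values only over legal (state, event) pairs
```

Because the supervisor prunes illegal (state, event) pairs before learning starts, the Q-table never explores the removed region; this confinement of the exploration space is the mechanism behind the reduced learning time described in the abstract.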

Place, publisher, year, edition, pages
IEEE, 2023. Vol. 11, p. 59840-59853
Keywords [en]
Discrete event system, linear temporal logic, supervisory control theory, reinforcement learning
National Category
Control Engineering
Research subject
Applied and Computational Mathematics, Optimization and Systems Theory; Computer Science; Industrial Information and Control Systems
Identifiers
URN: urn:nbn:se:kth:diva-330695
DOI: 10.1109/access.2023.3285432
ISI: 001018594800001
Scopus ID: 2-s2.0-85163172875
OAI: oai:DiVA.org:kth-330695
DiVA, id: diva2:1778207
Projects
XPRES
Funder
XPRES - Initiative for excellence in production research
Note

QC 20230704

Available from: 2023-06-30. Created: 2023-06-30. Last updated: 2023-07-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text: https://ieeexplore.ieee.org/abstract/document/10149832/authors#authors
Scopus

Authority records

Tan, Kaige; Feng, Lei

Search in DiVA

By author/editor
Yang, Junjun; Tan, Kaige; Feng, Lei; El-Sherbeeny, Ahmed M.; Li, Zhiwu
By organisation
Mechatronics and Embedded Control Systems
In the same journal
IEEE Access
Control Engineering
