KTH Publications (kth.se)
A phased robotic assembly policy based on a PL-LSTM-SAC algorithm
School of Mechano-Electronic Engineering, Xidian University, Xi'an, Shaanxi 710071, China.
School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China.
2025 (English). In: Journal of Manufacturing Systems, ISSN 0278-6125, E-ISSN 1878-6642, Vol. 78, p. 351-369. Article in journal (Refereed). Published.
Abstract [en]

To address key problems in current robotic automated assembly, namely the limitations of model-based methods in unstructured assembly scenarios, the low training efficiency of learning-based methods, and limited policy generalization, this paper proposes two deep-reinforcement-learning modeling methodologies under an overall phased-assembly framework for complex robotic assembly tasks: separated-phased policy modeling (SPM) and integrated policy modeling (IPM). For policy learning, we present a refined SAC algorithm that merges a policy-lead mechanism with an LSTM network (PL-LSTM-SAC). A comprehensive testbed based on the assembly of a triple-task planetary gear train is designed to validate the framework and the proposed approach. Experimental results indicate that the trained assembly policies for each task are effective under both policy modeling methodologies, with SPM showing higher stability and policy convergence efficiency than IPM. Physical tests demonstrate the sim-to-real transferability of policies trained with both SPM and IPM, achieving an average success rate of 92.0 %. The proposed PL-LSTM-SAC algorithm significantly accelerates training and improves the compliance and overall performance of assembly actions, reducing the average contact force during assembly processes by 13.9 %.
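The PL-LSTM-SAC algorithm builds on Soft Actor-Critic (SAC). For readers unfamiliar with SAC, its entropy-regularized Bellman target can be sketched as below. This is a minimal generic illustration with hypothetical values, not the paper's implementation; the paper's policy-lead mechanism and LSTM network are not reproduced here.

```python
def soft_bellman_target(reward, next_q1, next_q2, log_pi_next,
                        alpha=0.2, gamma=0.99, done=False):
    """Entropy-regularized TD target used by standard Soft Actor-Critic.

    The minimum over twin critics curbs Q-value overestimation; the
    -alpha * log_pi term rewards policy entropy, encouraging exploration.
    """
    soft_value = min(next_q1, next_q2) - alpha * log_pi_next
    return reward + gamma * (0.0 if done else soft_value)

# Hypothetical transition values, for illustration only.
y = soft_bellman_target(reward=1.0, next_q1=4.0, next_q2=4.5, log_pi_next=-1.0)
print(y)
```

In a full SAC implementation this target is regressed by both critic networks; the PL-LSTM-SAC variant described in the paper additionally conditions the actor and critics on recurrent (LSTM) state and a policy-lead mechanism.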

Place, publisher, year, edition, pages
Elsevier BV, 2025. Vol. 78, p. 351-369.
Keywords [en]
Deep reinforcement learning, LSTM network, Policy-lead mechanism, Robot learning, Robotic assembly
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-358235
DOI: 10.1016/j.jmsy.2024.12.008
ISI: 001403370200001
Scopus ID: 2-s2.0-85212823822
OAI: oai:DiVA.org:kth-358235
DiVA, id: diva2:1924869
Note

QC 20250114

Available from: 2025-01-07. Created: 2025-01-07. Last updated: 2025-12-05. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Wang, Lihui

Search in DiVA

By author/editor
Wang, Lihui
By organisation
Industrial Production Systems
In the same journal
Journal of manufacturing systems
Robotics and automation

Search outside of DiVA

Google
Google Scholar
