
Reinforcement learning of motor skills using Policy Search and human corrective advice

Celemin, Carlos ; Maeda, Guilherme ; Ruiz-del-Solar, Javier ; Peters, Jan ; Kober, Jens (2024)
Reinforcement learning of motor skills using Policy Search and human corrective advice.
In: The International Journal of Robotics Research, 2019, 38 (14)
doi: 10.26083/tuprints-00016981
Article, Secondary publication, Publisher's version

Warning: A newer version of this entry is available.

Abstract

Robot learning problems are limited by physical constraints, which make learning successful policies for complex motor skills on real systems unfeasible. Some reinforcement learning methods, like Policy Search, offer stable convergence toward locally optimal solutions, whereas interactive machine learning or learning-from-demonstration methods allow fast transfer of human knowledge to the agents. However, most methods require expert demonstrations. In this work, we propose the use of human corrective advice in the actions domain for learning motor trajectories. Additionally, we combine this human feedback with reward functions in a Policy Search learning scheme. The use of both sources of information speeds up the learning process, since the intuitive knowledge of the human teacher can be easily transferred to the agent, while the Policy Search method with the cost/reward function takes over for supervising the process and reducing the influence of occasional wrong human corrections. This interactive approach has been validated for learning movement primitives with simulated arms with several degrees of freedom in reaching via-point movements, and also using real robots in such tasks as “writing characters” and the ball-in-a-cup game. Compared with standard reinforcement learning without human advice, the results show that the proposed method not only converges to higher rewards when learning movement primitives, but also that the learning is sped up by a factor of 4 to 40, depending on the task.

Entry type: Article
Published: 2024
Author(s): Celemin, Carlos ; Maeda, Guilherme ; Ruiz-del-Solar, Javier ; Peters, Jan ; Kober, Jens
Type of entry: Secondary publication
Title: Reinforcement learning of motor skills using Policy Search and human corrective advice
Language: English
Date of publication: 21 May 2024
Place: Darmstadt
Date of first publication: 2019
Place of first publication: Thousand Oaks, California, USA
Publisher: SAGE Publications
Title of journal, newspaper, or series: The International Journal of Robotics Research
Volume: 38
Issue number: 14
DOI: 10.26083/tuprints-00016981
URL / URN: https://tuprints.ulb.tu-darmstadt.de/16981
Origin: Secondary publication DeepGreen

Keywords: Reinforcement learning, policy search, learning from demonstrations, interactive machine learning, movement primitives, motor skills
Status: Publisher's version
URN: urn:nbn:de:tuda-tuprints-169814
Dewey Decimal Classification (DDC) subject group: 000 Generalities, computer science, information science > 004 Computer science
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Intelligent Autonomous Systems
Date deposited: 21 May 2024 09:17
Last modified: 23 May 2024 13:47