
Learning control policies from optimal trajectories

Zelch, Christoph ; Peters, Jan ; Stryk, Oskar von (2020)
Learning control policies from optimal trajectories.
2020 IEEE International Conference on Robotics and Automation. Paris, France (31.08.2020-31.08.2020)
doi: 10.1109/ICRA40945.2020.9196791
Conference publication, Bibliography

Abstract

The ability to optimally control robotic systems offers significant advantages for their performance. While time-dependent optimal trajectories can be computed numerically for high-dimensional nonlinear system dynamics models, constraints, and objectives, finding optimal feedback control policies for such systems is hard. This is unfortunate, as without a policy, the control of real-world systems requires frequent correction or replanning to compensate for disturbances and model errors. In this paper, a feedback control policy is learned from a set of optimal reference trajectories using Gaussian processes. Information from existing trajectories and the current policy is used to find promising start points for the computation of further optimal trajectories. This step is important because it avoids exhaustive sampling of the complete state space, which is impractical due to its high dimensionality, and focuses computation on the relevant region. The presented method has been applied in simulation to a swing-up problem of an underactuated pendulum and an energy-minimal point-to-point movement of a 3-DOF industrial robot.
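To make the general idea concrete, the following minimal Python sketch (an illustrative assumption, not the authors' implementation) fits a Gaussian-process feedback policy u = pi(x) to state-action samples taken from precomputed optimal trajectories; the two-dimensional pendulum-like state, the placeholder training data, and the kernel choice are all hypothetical.

    # Minimal sketch (assumption, not the paper's implementation): learn a
    # feedback policy u = pi(x) by Gaussian-process regression on state-action
    # pairs collected from a set of precomputed optimal reference trajectories.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    # Hypothetical training data: each optimal trajectory contributes pairs
    # (state x_t, optimal control u_t). Here random placeholders stand in.
    rng = np.random.default_rng(0)
    X_train = rng.uniform(-np.pi, np.pi, size=(200, 2))    # states [angle, angular velocity]
    u_train = np.sin(X_train[:, 0]) - 0.1 * X_train[:, 1]  # placeholder "optimal" controls

    # GP policy: kernel hyperparameters are fit by maximizing the marginal
    # likelihood; the white-noise term absorbs numerical noise in the
    # optimal-control solutions.
    kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
    policy = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    policy.fit(X_train, u_train)

    # Closed-loop use: query the GP at the current state. The predictive
    # standard deviation indicates how far the state lies from the training
    # trajectories and can flag regions where a further optimal trajectory
    # should be computed and added to the training set.
    x_now = np.array([[0.3, -0.5]])
    u_mean, u_std = policy.predict(x_now, return_std=True)

The predictive uncertainty is what makes the GP a natural fit here: it provides exactly the kind of information about poorly covered state-space regions that the paper uses to select promising start points for additional optimal trajectories, instead of sampling the full state space.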

Item type: Conference publication
Published: 2020
Author(s): Zelch, Christoph ; Peters, Jan ; Stryk, Oskar von
Entry type: Bibliography
Title: Learning control policies from optimal trajectories
Language: English
Publication year: 15 September 2020
Publisher: IEEE
Book title: 2020 IEEE International Conference on Robotics and Automation (ICRA 2020)
Event title: 2020 IEEE International Conference on Robotics and Automation
Event location: Paris, France
Event dates: 31.08.2020-31.08.2020
DOI: 10.1109/ICRA40945.2020.9196791

Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Intelligent Autonomous Systems
20 Department of Computer Science > Simulation, Systems Optimization and Robotics
Date deposited: 30 Nov 2021 14:17
Last modified: 08 Dec 2023 10:14