Koc, Okan (2018)
Optimal Trajectory Generation and Learning Control for Robot Table Tennis.
Technische Universität Darmstadt
Dissertation, first publication
Abstract
As robots become more capable in terms of hardware, and more complex tasks are considered, optimality starts playing a more important role in the design of the algorithms implemented in these systems. Optimality is a guiding principle that directs the computation of feasible and efficient solutions to different robotics tasks. In control theory, this principle is implemented online as a set of efficient numerical optimization algorithms that, in addition to solving the task, aim to save a suitably defined effort or energy term. This thesis investigates trajectory generation, learning and control for dynamic tasks from the unifying point of view of optimization. As an application, we focus on table tennis, a challenging task where robots are yet to outperform humans. We believe that the dexterity and accuracy required for this dynamic task hinge on developments in online optimization and efficient learning algorithms.

We consider trajectory generation for table tennis in the first part of the thesis. In highly dynamic tasks like table tennis that involve moving targets, planning is necessary to figure out when, where and how to intercept the target. Motion planning can be very challenging in robotic table tennis in particular, due to time constraints, the dimension of the search space and joint limits. Conventional planning algorithms often rely on a fixed virtual hitting plane to construct robot striking trajectories. These algorithms, however, generate restrictive strokes and can result in unnatural strategies when compared with human play. In this thesis, we introduce a new trajectory generation framework for robotic table tennis that does not involve a fixed hitting plane. A free-time optimal control approach is used to derive two different trajectory optimizers. The resulting two algorithms, Focused Player and Defensive Player, encode two different play styles. We evaluate their performance in simulation and on our robot table tennis platform with a high-speed, cable-driven, seven-DOF robot arm. The algorithms return the balls to the opponent’s court with a higher probability than a method based on a virtual hitting plane. Moreover, both can be run online, and the trajectories can be corrected with new ball observations.

In the second part of the thesis, we look at how such trajectories, computed on the kinematics level, can be tracked accurately with learning control based approaches. Highly dynamic tasks like table tennis require large accelerations and precise tracking for successful performance. To track desired trajectories well, such tasks usually rely on accurate models and/or high-gain feedback. While kinematic optimization allows for efficient representation and online generation of hitting trajectories, learning to track such dynamic movements with inaccurate models remains an open problem. In particular, stability issues surrounding the learning performance in the iteration domain can prevent the successful implementation of model-based learning approaches. To achieve accurate tracking for these tasks in a stable and efficient way, we propose a new adaptive Iterative Learning Control algorithm that is implemented efficiently using a recursive approach. Moreover, covariance estimates of the model matrices are used to exercise caution during learning. We evaluate the performance of the proposed approach on our robotic table tennis platform, where we show how the performance of two Barrett WAMs can be optimized. Our implementation on the table tennis platform compares favorably with two state-of-the-art approaches.

Finally, we discuss an alternative learning from demonstrations approach, where we learn sparse representations from demonstrated movements. Learning from demonstrations is an easy and intuitive way to show examples of successful behavior to a robot. However, the fact that humans optimize for, or take advantage of, their own body and not that of the robot, usually called the embodiment problem in robotics, often prevents industrial robots from executing the task in a straightforward way. The demonstrated movements often do not or cannot utilize the degrees of freedom of the robot efficiently, and typically suffer from excessive execution errors. In the last chapter, we show a new approach that can alleviate some of these difficulties by learning sparse representations of movement. Moreover, the number of learned parameters is independent of the degrees of freedom of the robot. Sparsity is a desirable feature for policy search Reinforcement Learning algorithms that adapt the parameters of these movement primitives. By ranking the learned parameters on the Elastic Net path in terms of importance, we note that our approach could be useful in combating the curse of dimensionality in robot learning applications. We show preliminary results on the real robot setup, including a successful table tennis serve using our new movement primitive representation.

Throughout the thesis, we present and analyze in detail new control and learning algorithms. The efficient online optimization approaches presented can be used to solve not just table tennis problems but can also be adapted to other dynamic tasks.
Item type: | Dissertation | ||||
---|---|---|---|---|---|
Published: | 2018 | ||||
Author(s): | Koc, Okan | ||||
Type of entry: | First publication | ||||
Title: | Optimal Trajectory Generation and Learning Control for Robot Table Tennis | ||||
Language: | English | ||||
Referees: | Peters, Prof. Dr. Jan ; Vijayakumar, Prof. Dr. Sethu | ||||
Date of publication: | 24 October 2018 | ||||
Place: | Darmstadt | ||||
Date of oral examination: | 24 October 2018 | ||||
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/8948 | ||||
URN: | urn:nbn:de:tuda-tuprints-89486 | ||||
Dewey Decimal Classification (DDC) subject group: | 000 Generalities, computer science, information science > 004 Computer science | ||||
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Intelligent Autonomous Systems | ||||
Date deposited: | 13 Oct 2019 19:55 | ||||
Last modified: | 13 Oct 2019 19:55 | ||||