Lutter, Michael (2021)
Inductive Biases in Machine Learning for Robotics and Control.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00020048
Dissertation, first publication, publisher's version
Abstract
A fundamental problem of robotics is how one can program a robot to perform a task with its limited embodiment. Classical robotics solves this problem by carefully engineering interconnected modules. The main disadvantage is that this approach is labor-intensive and becomes close to impossible for unstructured environments and observations. Instead of manual engineering, one can rely solely on black-box models and data. In this paradigm, interconnected deep networks replace all modules of classical robotics. The network parameters are learned using reinforcement learning or self-supervised losses that predict the future.
In this thesis, we want to show that these two approaches of classical engineering and black-box deep networks are not mutually exclusive. One can transfer insights from classical robotics to black-box deep networks and obtain better learning algorithms for robotics and control. To show that incorporating existing knowledge as inductive biases in machine learning algorithms can improve performance, we present three different algorithms: (1) The Differentiable Newton Euler Algorithm (DiffNEA) reinterprets the classical system identification of rigid bodies. By leveraging automatic differentiation, virtual parameters, and gradient-based optimization, this approach guarantees physically consistent parameters and applies to a wider class of dynamical systems. (2) Deep Lagrangian Networks (DeLaN) combine deep networks with Lagrangian mechanics to learn dynamics models that conserve energy. Using two networks to represent the potential and kinetic energy enables the computation of a physically plausible dynamics model using the Euler-Lagrange equation. (3) Robust Fitted Value Iteration (rFVI) leverages the control-affine dynamics of mechanical systems to extend value iteration to adversarial reinforcement learning with continuous actions. The resulting approach enables the computation of an optimal policy that is robust to changes in the dynamics.
Each of these algorithms is evaluated on physical systems and compared to classical engineering and deep learning baselines. The experiments show that the inductive biases increase performance compared to black-box deep learning approaches. DiffNEA solves Ball-in-Cup on the physical Barrett WAM using offline model-based reinforcement learning and only four minutes of data. The deep network models fail on this task despite using more data. DeLaN obtains a model that can be used for energy control of under-actuated systems. Black-box models cannot be applied because they cannot infer the system energy. rFVI learns robust policies that can swing up the Furuta pendulum and cartpole. The rFVI policy is more robust to changes in the pendulum mass than deep reinforcement learning with uniform domain randomization.
In conclusion, this thesis introduces the combination of prior knowledge and deep learning. The presented algorithms highlight that one can use deep networks in more creative ways than naive input-output mappings for dynamics models and policies. Compared to the deep learning baselines, the proposed approaches can be applied to more problems and improve performance.
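As a rough illustration of the DeLaN idea described in the abstract, the sketch below derives forward dynamics from a kinetic and a potential energy model through the Euler-Lagrange equation. It is not the thesis implementation: it assumes a single degree of freedom, replaces the two learned networks with hand-written pendulum functions, and uses finite differences instead of automatic differentiation. All names (`forward_dynamics`, `pendulum_inertia`, `pendulum_potential`, the constants) are hypothetical.

```python
import math


def derivative(f, x, eps=1e-6):
    """Central finite difference; a stand-in for automatic differentiation."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def forward_dynamics(inertia, potential, q, qd, tau):
    """Acceleration of a 1-DoF system with L = 0.5 * H(q) * qd**2 - V(q).

    The Euler-Lagrange equation d/dt(dL/dqd) - dL/dq = tau expands to
    H(q) * qdd + 0.5 * H'(q) * qd**2 + V'(q) = tau, which is solved for qdd.
    """
    h = inertia(q)
    dh_dq = derivative(inertia, q)
    dv_dq = derivative(potential, q)
    return (tau - 0.5 * dh_dq * qd ** 2 - dv_dq) / h


def total_energy(inertia, potential, q, qd):
    """Kinetic plus potential energy, available because both are modeled."""
    return 0.5 * inertia(q) * qd ** 2 + potential(q)


# Assumed example system: a point-mass pendulum, q measured from hanging down.
MASS, LENGTH, GRAVITY = 1.0, 0.5, 9.81


def pendulum_inertia(q):
    # Placeholder for the kinetic energy model (constant inertia here).
    return MASS * LENGTH ** 2


def pendulum_potential(q):
    # Placeholder for the potential energy model.
    return -MASS * GRAVITY * LENGTH * math.cos(q)


if __name__ == "__main__":
    qdd = forward_dynamics(pendulum_inertia, pendulum_potential,
                           math.pi / 2, 0.0, 0.0)
    print("acceleration at horizontal:", qdd)
    print("energy at rest:", total_energy(pendulum_inertia,
                                          pendulum_potential, 0.0, 0.0))
```

Because the energies are explicit functions, the total energy can be read off directly, which is the property the abstract cites as necessary for energy control.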
Item type: | Dissertation | ||||
---|---|---|---|---|---|
Published: | 2021 | ||||
Author(s): | Lutter, Michael | ||||
Type of entry: | First publication | ||||
Title: | Inductive Biases in Machine Learning for Robotics and Control | ||||
Language: | English | ||||
Referees: | Peters, Prof. Jan ; Tedrake, Prof. Russ | ||||
Year of publication: | 2021 | ||||
Place: | Darmstadt | ||||
Collation: | xiii, 136 pages | ||||
Date of oral examination: | 19 November 2021 | ||||
DOI: | 10.26083/tuprints-00020048 | ||||
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/20048 | ||||
Status: | Publisher's version | ||||
URN: | urn:nbn:de:tuda-tuprints-200484 | ||||
Dewey Decimal Classification (DDC) subject group: | 000 Generalities, computer science, information science > 004 Computer science; 600 Technology, medicine, applied sciences > 620 Engineering and mechanical engineering | ||||
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Intelligent Autonomous Systems | ||||
Date deposited: | 03 Dec 2021 13:11 | ||||
Last modified: | 08 Dec 2021 07:54 | ||||