Look, Andreas (2023)
Deterministic Approximations for Deep State-Space Models.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00026352
Dissertation, Primary publication, Publisher's version
Abstract
This thesis focuses on neural-network-based modeling of stochastic dynamical systems, with applications in autonomous driving. We define three goals for the model, all of which must be achieved at low computational cost because autonomous vehicles rely on low-compute, energy-constrained chips. First, the model must accurately capture the data uncertainty, also referred to as aleatoric uncertainty. Data uncertainty cannot be reduced by collecting more data, since we only have partial information: we are unable to observe all states, such as the driver's intention. To illustrate, consider a vehicle approaching a junction where it can turn left or right. If the driver does not use an indicator, we cannot determine which direction the vehicle will take. Second, the model must account for interactions between traffic participants. Traffic is highly interactive: the actions of one participant can affect the actions of others, so modeling these interactions is vital for accurate traffic forecasting. For example, when one vehicle merges into the lane of another, both vehicles need to interact and adjust their speed to accommodate the merge. Lastly, as it is impossible to include all traffic scenarios in the training data set, the model needs to account for model uncertainty, which arises from a lack of knowledge and is also known as epistemic uncertainty. Model uncertainty is especially important for traffic scenarios that have not been observed during training. Without accounting for it, the model is limited to modeling the intrinsic data uncertainty.
Throughout this thesis, we introduce several advancements to Deep State-Space Models (DSSMs) that address the challenges of capturing intrinsic data uncertainty, modeling interactions, and incorporating model uncertainty, all at low computational cost. DSSMs extend state-space models with neural transition and emission models. A DSSM describes a partially observable system in which each emission is generated by a corresponding latent state. The dynamics of the latent states follow a Markovian structure: the state at each time point depends solely on the state at the previous time point. Because the transition and emission models are nonlinear neural networks, DSSMs offer high modeling capacity. Moreover, the stochasticity in the transition and emission models allows DSSMs to capture the inherent data uncertainty.
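As an illustrative sketch of the Markovian structure described above, the joint distribution of a DSSM can be written as follows. The notation is an assumption made here for illustration and is not taken from the thesis: $x_t$ denotes the latent state, $y_t$ the emission, and $\theta$ and $\phi$ the parameters of the transition and emission networks.

```latex
p(y_{1:T}, x_{0:T})
  = p(x_0) \prod_{t=1}^{T}
    \underbrace{p_\theta(x_t \mid x_{t-1})}_{\text{transition}} \,
    \underbrace{p_\phi(y_t \mid x_t)}_{\text{emission}}
```

Each factor conditions only on the immediately preceding latent state, which is what makes the latent dynamics Markovian, and each emission depends only on its own latent state.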
After an introduction and a review of relevant background material, the first part of the thesis focuses on fully observed dynamical systems, before the subsequent parts turn to partially observed systems. Classical frameworks for simulating stochastic dynamical systems rely heavily on Monte Carlo sampling. As we demonstrate in this thesis, accurate prediction requires many particles, which incurs a prohibitively high computational cost. To address this issue, we propose an alternative method that is computationally efficient and avoids extensive Monte Carlo sampling. Our method relies on an assumed density approach: we approximate the model's predictive distribution as a Gaussian at each time step and estimate its moments by progressive moment matching, horizontally in the time direction and vertically through the neural network layers. Because it exploits the layered structure of neural networks, the proposed method is computationally more efficient than existing numerical integration schemes. This unimodal approximation lays the foundation for more complex approximations in the later parts. To assess the efficacy of our approach, we apply the method in different domains.
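The idea of matching moments "vertically" through layers and "horizontally" through time can be sketched for a scalar toy model. This is a minimal illustration under stated assumptions, not the thesis's implementation: the transition network is assumed to be linear, ReLU, linear with additive Gaussian noise, and the names `transition_moments`, `predict`, and `PARAMS` as well as all parameter values are hypothetical. The ReLU moments use the standard closed form for a rectified Gaussian.

```python
import math


def gaussian_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_moments(mean, var, weight, bias):
    """Exact mean and variance of weight * x + bias for x with the given moments."""
    return weight * mean + bias, weight * weight * var


def relu_moments(mean, var):
    """Exact mean and variance of max(0, x) for x ~ N(mean, var)."""
    if var <= 0.0:
        return max(mean, 0.0), 0.0
    std = math.sqrt(var)
    z = mean / std
    out_mean = mean * gaussian_cdf(z) + std * gaussian_pdf(z)
    second_moment = (mean * mean + var) * gaussian_cdf(z) + mean * std * gaussian_pdf(z)
    return out_mean, max(second_moment - out_mean * out_mean, 0.0)


def transition_moments(mean, var, params):
    """Vertical pass: push a Gaussian through linear -> ReLU -> linear, then add noise."""
    w1, b1, w2, b2, noise_var = params
    mean, var = linear_moments(mean, var, w1, b1)
    mean, var = relu_moments(mean, var)
    mean, var = linear_moments(mean, var, w2, b2)
    return mean, var + noise_var


def predict(mean0, var0, params, num_steps):
    """Horizontal pass: treat the state as Gaussian again at every time step."""
    trajectory = [(mean0, var0)]
    for _ in range(num_steps):
        mean, var = trajectory[-1]
        trajectory.append(transition_moments(mean, var, params))
    return trajectory


# Hypothetical parameters (w1, b1, w2, b2, noise variance), chosen only for illustration.
PARAMS = (1.5, -0.2, 0.8, 0.1, 0.05)
trajectory = predict(0.0, 1.0, PARAMS, num_steps=5)
for step, (m, v) in enumerate(trajectory):
    print(f"t={step}: mean={m:.4f}, var={v:.4f}")
```

In this toy setting the first two moments of a single step are exact for a Gaussian input; the approximation lies in treating the output as Gaussian again before the next step. No sampling is involved, so the cost per step is fixed regardless of the accuracy that a particle-based estimate would need many samples to reach.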
In the second part of this thesis, we focus on partially observable systems and extend our framework to deterministic uncertainty modeling with interacting agents, where each agent represents a vehicle in an autonomous driving setting. Since a graph can capture the relations between agents, we use a DSSM with graph neural networks in the transition model. Moreover, we extend our deterministic moment matching scheme to accommodate the multimodal nature of traffic forecasting. We demonstrate the applicability of the proposed framework on different autonomous driving datasets.
Finally, we address the challenge of incorporating model uncertainty, that is, the uncertainty arising from a lack of knowledge, into DSSMs. We achieve this by introducing uncertainty over the neural network weights in the transition model. However, accounting for both data and model uncertainty during inference is computationally expensive, as it requires marginalization over both sources of uncertainty. To address this cost, we extend our deterministic approximation framework with uncertainty propagation rules that account for both sources of uncertainty. We provide benchmarks on different domains that demonstrate the applicability of our model as a general-purpose tool.
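To illustrate how a single propagation rule can cover both sources of uncertainty, the sketch below computes the moments of a scalar linear unit whose weight and bias are themselves random. It is an assumption-laden toy, not the rules derived in the thesis: weight, bias, and input are assumed mutually independent, and the function names are hypothetical. The variance formula is the standard identity for the product of two independent random variables.

```python
def product_moments(w_mean, w_var, x_mean, x_var):
    """Exact mean and variance of w * x for independent w and x."""
    mean = w_mean * x_mean
    var = w_var * x_var + w_var * x_mean ** 2 + w_mean ** 2 * x_var
    return mean, var


def uncertain_linear_moments(w_mean, w_var, b_mean, b_var, x_mean, x_var):
    """Moments of w * x + b with independent random weight, bias, and input."""
    prod_mean, prod_var = product_moments(w_mean, w_var, x_mean, x_var)
    return prod_mean + b_mean, prod_var + b_var


# Hypothetical values: weight with mean 2 and variance 0.25, input with mean 1 and variance 1.
mean, var = uncertain_linear_moments(2.0, 0.25, 0.1, 0.01, 1.0, 1.0)
print(f"mean={mean:.3f}, var={var:.3f}")
```

Setting the weight and bias variances to zero recovers the usual rule for a deterministic layer, so the weight variance terms are exactly the extra spread contributed by model uncertainty in this toy setting.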
Item type: | Dissertation
---|---
Published: | 2023
Author(s): | Look, Andreas
Type of entry: | Primary publication
Title: | Deterministic Approximations for Deep State-Space Models
Language: | English
Referees: | Peters, Prof. Jan ; Duvenaud, Prof. David ; Kandemir, Prof. Melih
Publication date: | 22 November 2023
Place of publication: | Darmstadt
Collation: | x, 131 pages
Date of oral examination: | 23 October 2023
DOI: | 10.26083/tuprints-00026352
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/26352
Status: | Publisher's version
URN: | urn:nbn:de:tuda-tuprints-263529
Dewey Decimal Classification (DDC) subject group: | 000 Generalities, computer science, information science > 004 Computer science; 600 Technology, medicine, applied sciences > 600 Technology
Division(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Intelligent Autonomous Systems
Date deposited: | 22 Nov 2023 13:03
Last modified: | 27 Nov 2023 10:28