Alt, B. ; Schultheis, M. ; Koeppl, H. (2020)
POMDPs in Continuous Time and Discrete Spaces.
34th Conference on Neural Information Processing Systems. virtual Conference (06.12.2020-12.12.2020)
Conference publication, bibliography
This is the latest version of this entry.
Abstract
Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space systems under partial observability. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description for simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partial observable Markov decision process (POMDP). By leveraging optimal filtering theory we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning we approximately solve the resulting partial integro-differential equation. We present (i) an approach solving the decision problem offline by learning an approximation of the value function and (ii) an online algorithm which provides a solution in belief space using deep reinforcement learning. We show the applicability on a set of toy examples which pave the way for future methods providing solutions for high dimensional problems.
Item type: | Conference publication |
---|---|
Published: | 2020 |
Author(s): | Alt, B. ; Schultheis, M. ; Koeppl, H. |
Type of entry: | Bibliography |
Title: | POMDPs in Continuous Time and Discrete Spaces |
Language: | English |
Publication date: | 22 October 2020 |
Book title: | Advances in Neural Information Processing Systems 33 (NeurIPS 2020) |
Event title: | 34th Conference on Neural Information Processing Systems |
Event location: | Virtual conference |
Event dates: | 06.12.2020-12.12.2020 |
URL / URN: | https://proceedings.neurips.cc/paper/2020/hash/992f0fed0720d... |
Additional information: | First publication |
Department(s)/field(s): | 18 Department of Electrical Engineering and Information Technology; 18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems; 18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications; DFG Collaborative Research Centres (incl. Transregio); DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres; Central Institutions; Central Institutions > Centre for Cognitive Science (CCS); DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet; DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet > B: Adaptation Mechanisms; DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet > B: Adaptation Mechanisms > Subproject B4: Planning |
Date deposited: | 26 Oct 2020 12:06 |
Last modified: | 03 Jul 2024 02:47 |
Available versions of this entry
- POMDPs in Continuous Time and Discrete Spaces. (deposited 31 Mar 2023 08:31)
- POMDPs in Continuous Time and Discrete Spaces. (deposited 26 Oct 2020 12:06) [Currently displayed]