POMDPs in Continuous Time and Discrete Spaces

Alt, Bastian ; Schultheis, Matthias ; Koeppl, Heinz (2023)
POMDPs in Continuous Time and Discrete Spaces.
34th Conference on Neural Information Processing Systems. Virtual conference (6-12 December 2020)
doi: 10.26083/tuprints-00023309
Conference publication, secondary publication, publisher's version

Abstract

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space systems under partial observability. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description for simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach that solves the decision problem offline by learning an approximation of the value function and (ii) an online algorithm that provides a solution in belief space using deep reinforcement learning. We show the applicability on a set of toy examples that pave the way for future methods providing solutions for high-dimensional problems.

Item type: Conference publication
Published: 2023
Author(s): Alt, Bastian ; Schultheis, Matthias ; Koeppl, Heinz
Type of entry: Secondary publication
Title: POMDPs in Continuous Time and Discrete Spaces
Language: English
Year of publication: 2023
Place: Darmstadt
Date of first publication: 2020
Publisher: Curran Associates, Inc.
Book title: Advances in Neural Information Processing Systems
Series volume: 33
Collation: 21 pages
Event title: 34th Conference on Neural Information Processing Systems
Event location: Virtual conference
Event dates: 6-12 December 2020
DOI: 10.26083/tuprints-00023309
URL / URN: https://tuprints.ulb.tu-darmstadt.de/23309
Origin: Secondary publication service
Status: Publisher's version
URN: urn:nbn:de:tuda-tuprints-233092
Dewey Decimal Classification (DDC) subject group: 000 Generalities, computer science, information science > 004 Computer science
500 Science and mathematics > 510 Mathematics
Department(s)/Field(s): 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications
DFG Collaborative Research Centres (incl. Transregio)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
Central Institutions
Central Institutions > Centre for Cognitive Science (CCS)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > B: Adaptation Mechanisms
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > B: Adaptation Mechanisms > Subproject B4: Planning
Date deposited: 31 Mar 2023 08:31
Last modified: 04 Apr 2023 13:02