POMDPs in Continuous Time and Discrete Spaces

Alt, Bastian ; Schultheis, Matthias ; Koeppl, Heinz (2023)
POMDPs in Continuous Time and Discrete Spaces.
34th Conference on Neural Information Processing Systems. virtual Conference (06.-12.12.2020)
doi: 10.26083/tuprints-00023309
Conference or Workshop Item, Secondary publication, Publisher's Version

Abstract

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space systems under partial observability. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description for simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach that solves the decision problem offline by learning an approximation of the value function and (ii) an online algorithm that provides a solution in belief space using deep reinforcement learning. We show the applicability on a set of toy examples, which pave the way for future methods providing solutions for high-dimensional problems.

Item Type: Conference or Workshop Item
Published: 2023
Creators: Alt, Bastian ; Schultheis, Matthias ; Koeppl, Heinz
Type of entry: Secondary publication
Title: POMDPs in Continuous Time and Discrete Spaces
Language: English
Date: 2023
Place of Publication: Darmstadt
Year of primary publication: 2020
Publisher: Curran Associates, Inc.
Book Title: Advances in Neural Information Processing Systems
Series Volume: 33
Collation: 21 pages
Event Title: 34th Conference on Neural Information Processing Systems
Event Location: virtual Conference
Event Dates: 06.-12.12.2020
DOI: 10.26083/tuprints-00023309
URL / URN: https://tuprints.ulb.tu-darmstadt.de/23309
Origin: Secondary publication service
Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-233092
Classification DDC: 000 Generalities, computers, information > 004 Computer science
500 Science and mathematics > 510 Mathematics
Divisions: 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications
DFG-Collaborative Research Centres (incl. Transregio)
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
Zentrale Einrichtungen
Zentrale Einrichtungen > Centre for Cognitive Science (CCS)
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > B: Adaptation Mechanisms
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > B: Adaptation Mechanisms > Subproject B4: Planning
Date Deposited: 31 Mar 2023 08:31
Last Modified: 04 Apr 2023 13:02