TU Darmstadt / ULB / TUbiblio

Discrete-Time Mean Field Control with Environment States

Cui, K. ; Tahir, A. ; Sinzger, M. ; Koeppl, H. (2021):
Discrete-Time Mean Field Control with Environment States.
60th Conference on Decision and Control (CDC2021), virtual Conference, 13.-15.12.2021, [Conference or Workshop Item]

Abstract

Multi-agent reinforcement learning methods have shown remarkable potential in solving complex multi-agent problems but mostly lack theoretical guarantees. Recently, mean field control and mean field games have been established as a tractable solution for large-scale multi-agent problems with many agents. In this work, driven by a motivating scheduling problem, we consider a discrete-time mean field control model with common environment states. We rigorously establish approximate optimality as the number of agents grows in the finite agent case and find that a dynamic programming principle holds, resulting in the existence of an optimal stationary policy. As exact solutions are difficult in general due to the resulting continuous action space of the limiting mean field Markov decision process, we apply established deep reinforcement learning methods to solve the associated mean field control problem. The performance of the learned mean field control policy is compared to typical multi-agent reinforcement learning approaches and is found to converge to the mean field performance for sufficiently many agents, verifying the obtained theoretical results and reaching competitive solutions

Item Type: Conference or Workshop Item
Erschienen: 2021
Creators: Cui, K. ; Tahir, A. ; Sinzger, M. ; Koeppl, H.
Title: Discrete-Time Mean Field Control with Environment States
Language: English
Abstract:

Multi-agent reinforcement learning methods have shown remarkable potential in solving complex multi-agent problems but mostly lack theoretical guarantees. Recently, mean field control and mean field games have been established as a tractable solution for large-scale multi-agent problems with many agents. In this work, driven by a motivating scheduling problem, we consider a discrete-time mean field control model with common environment states. We rigorously establish approximate optimality as the number of agents grows in the finite agent case and find that a dynamic programming principle holds, resulting in the existence of an optimal stationary policy. As exact solutions are difficult in general due to the resulting continuous action space of the limiting mean field Markov decision process, we apply established deep reinforcement learning methods to solve the associated mean field control problem. The performance of the learned mean field control policy is compared to typical multi-agent reinforcement learning approaches and is found to converge to the mean field performance for sufficiently many agents, verifying the obtained theoretical results and reaching competitive solutions

Uncontrolled Keywords: reinforcement learning, machine learning, MARL, emergenCITY_KOM
Divisions: 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications
DFG-Collaborative Research Centres (incl. Transregio)
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
LOEWE
LOEWE > LOEWE-Zentren
LOEWE > LOEWE-Zentren > emergenCITY
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms > Subproject C3: Content-centred perspective
Event Title: 60th Conference on Decision and Control (CDC2021)
Event Location: virtual Conference
Event Dates: 13.-15.12.2021
Date Deposited: 09 Aug 2021 07:05
Corresponding Links:
Export:
Suche nach Titel in: TUfind oder in Google
Send an inquiry Send an inquiry

Options (only for editors)
Show editorial Details Show editorial Details