Cui, Kai (2024)
Large-Scale Multi-Agent Reinforcement Learning via Mean Field Games.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00028568
Ph.D. Thesis, Primary publication, Publisher's Version
Abstract
In this dissertation, we discuss the mathematically rigorous multi-agent reinforcement learning frameworks of mean field games (MFGs) and mean field control (MFC). Dynamical multi-agent control problems and their game-theoretic counterparts find many applications in practice, but can be difficult to scale to many agents. MFGs and MFC allow tractable modeling of large-scale dynamical multi-agent control and game problems. In essence, the idea is to reduce the interaction between infinitely many homogeneous agents to their anonymous distribution – the so-called mean field. This reduces many practical problems to the consideration of a single representative agent and – by the law of large numbers – its probability law. In this thesis, we present various novel learning algorithms and theoretical frameworks for MFGs and MFC. We address existing algorithmic limitations, and also extend MFGs and MFC beyond their restriction to (i) weakly interacting agents, (ii) all-knowing and rational agents, or (iii) homogeneous agents. Lastly, some practical applications are briefly considered to demonstrate the usefulness of our developed algorithms.
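To make the mean field reduction concrete, here is a minimal sketch in standard MFG notation; the symbols (states $x_t^i$, actions $u_t^i$, kernel $p$) are illustrative conventions, not taken verbatim from the thesis. With $N$ homogeneous agents, the empirical state distribution converges to a deterministic flow, and each agent's dynamics then depend on the population only through that flow:

```latex
% Illustrative MFG notation (assumed, not quoted from the thesis):
% N agents, states x_t^i, actions u_t^i, mean field limit mu_t.
\[
  \mu_t^N \;=\; \frac{1}{N} \sum_{i=1}^{N} \delta_{x_t^i}
  \;\xrightarrow{\;N \to \infty\;}\; \mu_t,
  \qquad
  x_{t+1}^i \;\sim\; p\!\left(\cdot \mid x_t^i,\, u_t^i,\, \mu_t\right).
\]
```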
Firstly, we consider the competitive case of MFGs. There, we show that even in the simplest case of finite MFGs, existing algorithms are strongly limited in their generality. In particular, the common assumption of contractive fixed-point operators is shown to be difficult to satisfy. We then contribute and analyze approximate learning algorithms for MFGs based on regularization, which allows a trade-off between approximation quality and tractability. We then proceed to extend these results to MFGs on graphs and hypergraphs, in order to increase the descriptiveness of MFGs and relax the homogeneity restriction. Lastly, we also extend towards settings with both a few strongly interacting agents and many weakly interacting agents, in order to obtain tractability in cases where some agents do not fall under the mean field approximation.
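As a hedged illustration of such a regularized scheme (a sketch of the general idea, not the thesis' algorithms): the code below alternates a softmax (entropy-regularized) best response against a fixed mean field flow with the mean field induced by that policy, using damping. The transition kernel `P`, the crowd-aversion reward, and the temperature `eta` are hypothetical placeholders.

```python
import numpy as np

# Minimal sketch of a regularized fixed-point iteration for a finite MFG.
# All model quantities here are hypothetical placeholders.
S, A, T = 5, 3, 10                            # states, actions, horizon
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a] = next-state distribution
mu0 = np.full(S, 1.0 / S)                     # initial state distribution
eta = 0.5                                     # entropy-regularization temperature

def reward(mu):
    # Hypothetical crowd-aversion reward: r(s, a, mu) = -mu(s), shape (S, A).
    return -np.repeat(mu[:, None], A, axis=1)

def softmax_best_response(mu_flow):
    """Entropy-regularized best response via backward induction."""
    V = np.zeros(S)
    pi = np.zeros((T, S, A))
    for t in reversed(range(T)):
        Q = reward(mu_flow[t]) + P @ V        # (S, A): r + E[V(s')]
        pi[t] = np.exp(Q / eta)
        pi[t] /= pi[t].sum(axis=1, keepdims=True)
        V = eta * np.log(np.exp(Q / eta).sum(axis=1))  # soft (log-sum-exp) value
    return pi

def induced_mean_field(pi):
    """Forward induction: the mean field flow generated by policy pi."""
    mu_flow = [mu0]
    for t in range(T - 1):
        mu_flow.append(np.einsum('s,sa,sap->p', mu_flow[t], pi[t], P))
    return mu_flow

# Damped fixed-point iteration on the mean field flow.
mu_flow = [mu0.copy() for _ in range(T)]
for _ in range(100):
    new_flow = induced_mean_field(softmax_best_response(mu_flow))
    mu_flow = [0.5 * m + 0.5 * n for m, n in zip(mu_flow, new_flow)]
```

The temperature `eta` is where the approximation–tractability trade-off appears: a larger `eta` makes the fixed-point operator better behaved but biases the equilibrium, while a smaller `eta` approaches the unregularized game.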
Secondly, we investigate cooperative MFC. Initially, we consider an extension to environmental states under a simplifying assumption of static mean fields, and show that the MFC solution is approximately optimal among all finite-agent solutions. More generally, we proceed to extend MFC to strongly interacting agents, analogous to the MFG scenario. Our final extension considers partial observability, where decentralized agents act only on locally available information. Here, we introduce a framework that optimizes over Lipschitz classes of policies. For the latter two settings, we obtain policy gradient approximation guarantees. The frameworks are verified theoretically, by showing approximate optimality of MFC, and experimentally, by demonstrating performance comparable or superior to state-of-the-art multi-agent reinforcement learning algorithms.
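For intuition, cooperative MFC can be viewed as a single "lifted" decision process whose state is the mean field itself. Below is a minimal sketch under that view, using a zeroth-order (finite-difference) gradient as a simple stand-in for the policy gradient methods discussed above; the kernel `P`, the social reward, and the softmax parameterization are hypothetical.

```python
import numpy as np

# Minimal sketch of MFC as a "lifted" MDP on the probability simplex,
# optimized by a zeroth-order gradient. All model choices are hypothetical.
S, A, T = 5, 3, 10
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a] = next-state distribution
mu0 = np.full(S, 1.0 / S)

def social_return(theta):
    """Deterministic mean field rollout under a softmax policy theta[S, A]."""
    pi = np.exp(theta)
    pi /= pi.sum(axis=1, keepdims=True)
    mu, ret = mu0, 0.0
    for _ in range(T):
        ret -= mu @ mu                        # hypothetical objective: spread agents out
        mu = np.einsum('s,sa,sap->p', mu, pi, P)
    return ret

# Two-point finite-difference gradient ascent on the policy parameters.
theta, eps, lr = np.zeros((S, A)), 1e-3, 0.5
for _ in range(200):
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):
        d = np.zeros_like(theta)
        d[idx] = eps
        grad[idx] = (social_return(theta + d) - social_return(theta - d)) / (2 * eps)
    theta += lr * grad
```

The lifted view is what buys tractability: the policy is optimized once, against the deterministic mean field dynamics, rather than jointly over N individual agents.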
Finally, we briefly explore some potential applications of MFGs and MFC in scenarios with large populations of agents. We survey applications in distributed computing, cyber-physical systems, autonomous mobility and routing, as well as the natural and social sciences. We also take a closer look at two particular applications in UAV swarm control and edge computing. In the former, we consider the effect of collision avoidance as an additional constraint for MFC in embodied robot swarms. In the latter, we compare MFG and MFC results for a computational offloading scenario.
Overall, in this thesis we investigate the suitability of methods based on MFGs and MFC for large-scale tractable multi-agent reinforcement learning. We contribute novel learning methods and theoretical approximation frameworks, and study several applications. On the whole, we find that MFGs and MFC can successfully be applied to analyze large-scale control and game problems with high generality, in some cases outperforming state-of-the-art solutions.
Item Type: | Ph.D. Thesis
---|---
Published: | 2024
Creators: | Cui, Kai
Type of entry: | Primary publication
Title: | Large-Scale Multi-Agent Reinforcement Learning via Mean Field Games
Language: | English
Referees: | Koeppl, Prof. Dr. Heinz ; Laurière, Prof. Dr. Mathieu
Date: | 24 October 2024
Place of Publication: | Darmstadt
Collation: | xvii, 327 pages
Refereed / Defense / Oral examination: | 17 October 2024
DOI: | 10.26083/tuprints-00028568
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/28568
Status: | Publisher's Version
URN: | urn:nbn:de:tuda-tuprints-285682
Classification DDC: | 000 Generalities, computers, information > 004 Computer science ; 600 Technology, medicine, applied sciences > 620 Engineering and machine engineering ; 600 Technology, medicine, applied sciences > 621.3 Electrical engineering, electronics
Divisions: | 18 Department of Electrical Engineering and Information Technology > Self-Organizing Systems Lab ; LOEWE > LOEWE-Zentren > emergenCITY
Date Deposited: | 24 Oct 2024 12:13
Last Modified: | 25 Oct 2024 12:44