Cui, Kai (2024)
Large-Scale Multi-Agent Reinforcement Learning via Mean Field Games.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00028568
Dissertation, first publication, publisher's version
Abstract
In this dissertation, we discuss the mathematically rigorous multi-agent reinforcement learning frameworks of mean field games (MFG) and mean field control (MFC). Dynamical multi-agent control problems and their game-theoretic counterparts find many applications in practice, but can be difficult to scale to many agents. MFGs and MFC allow the tractable modeling of large-scale dynamical multi-agent control and game problems. In essence, the idea is to reduce interaction between infinitely many homogeneous agents to their anonymous distribution – the so-called mean field. This reduces many practical problems to considering a single representative agent and – by the law of large numbers – its probability law. In this thesis, we present various novel learning algorithms and theoretical frameworks of MFGs and MFC. We address existing algorithmic limitations, and also extend MFGs and MFC beyond their restriction to (i) weakly-interacting agents, (ii) all-knowing and rational agents, or (iii) homogeneity of agents. Lastly, some practical applications are briefly considered to demonstrate the usefulness of our developed algorithms.
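The mean field idea described above can be illustrated with a small numerical sketch. It is not taken from the dissertation: the two-state model, the kernel `transition`, and all constants are hypothetical and chosen only for illustration. The sketch compares the empirical state distribution of N simulated agents with the deterministic mean field recursion followed by a single representative agent's probability law.

```python
import random

def transition(state, mu):
    """Hypothetical kernel: probability that an agent is in state 1 next,
    given its own state and the mean field mu (fraction of agents in state 1)."""
    return 0.2 + 0.6 * mu if state == 0 else 0.9 - 0.5 * mu

def mean_field_step(mu):
    """Deterministic limit: the law of one representative agent, one step ahead."""
    return (1.0 - mu) * transition(0, mu) + mu * transition(1, mu)

def mean_field_trajectory(mu0, horizon):
    """Fraction of mass in state 1 at times 0, ..., horizon - 1 in the limit model."""
    mu, trajectory = mu0, []
    for _ in range(horizon):
        trajectory.append(mu)
        mu = mean_field_step(mu)
    return trajectory

def simulate_agents(n_agents, mu0, horizon, rng):
    """Simulate N anonymous agents that interact only through their empirical distribution."""
    states = [1 if rng.random() < mu0 else 0 for _ in range(n_agents)]
    fractions = []
    for _ in range(horizon):
        mu = sum(states) / n_agents  # empirical mean field
        fractions.append(mu)
        states = [1 if rng.random() < transition(s, mu) else 0 for s in states]
    return fractions

def max_gap(n_agents, mu0=0.1, horizon=20, seed=0):
    """Largest deviation between the N-agent system and the mean field limit."""
    rng = random.Random(seed)
    empirical = simulate_agents(n_agents, mu0, horizon, rng)
    limit = mean_field_trajectory(mu0, horizon)
    return max(abs(a - b) for a, b in zip(empirical, limit))

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, round(max_gap(n), 4))
```

Under these assumptions, the gap typically shrinks as the number of agents grows, which is the law-of-large-numbers effect the abstract refers to. The limit model needs only one scalar recursion, regardless of N.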
Firstly, we consider the competitive case of MFGs. There, we show that in the simplest case of finite MFGs, existing algorithms are strongly limited in their generality. In particular, the common assumption of contractive fixed-point operators is shown to be difficult to fulfill. We then contribute and analyze approximate learning algorithms for MFGs based on regularization, which allows for a trade-off between approximation and tractability. We then proceed to extend results to MFGs on graphs and hypergraphs, in order to increase the descriptiveness of MFGs and ameliorate the restriction of homogeneity. Lastly, we also extend towards the presence of both strongly interacting and many weakly-interacting agents, in order to obtain tractability for cases where some agents do not fall under the mean field approximation.
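The role of contractivity and regularization mentioned above can be sketched on a toy problem. The following is an illustrative assumption, not the algorithm analyzed in the dissertation: a one-shot congestion game with two actions, hypothetical rewards in `rewards`, and a softmax (entropy-regularized) response with a `temperature` parameter. In this one-shot setting the mean field is simply the action distribution of the shared policy, so the fixed-point operator maps a distribution to the response against it.

```python
import math

def rewards(mu):
    """Hypothetical congestion rewards: crowded actions pay less,
    and action 1 carries an extra fixed cost of 0.2."""
    return [-mu[0], -mu[1] - 0.2]

def best_response(mu):
    """Unregularized operator: all mass on a reward-maximizing action."""
    r = rewards(mu)
    best = max(range(len(r)), key=lambda a: r[a])
    return [1.0 if a == best else 0.0 for a in range(len(r))]

def softmax_response(mu, temperature):
    """Regularized operator: Boltzmann distribution over actions."""
    r = rewards(mu)
    m = max(r)  # subtract the maximum for numerical stability
    weights = [math.exp((x - m) / temperature) for x in r]
    total = sum(weights)
    return [w / total for w in weights]

def iterate(operator, mu0, steps=200):
    """Plain fixed-point iteration; returns the last iterate and the last step size."""
    mu, last_change = list(mu0), 0.0
    for _ in range(steps):
        new_mu = operator(mu)
        last_change = max(abs(a - b) for a, b in zip(new_mu, mu))
        mu = new_mu
    return mu, last_change

if __name__ == "__main__":
    start = [0.9, 0.1]
    print("exact:", iterate(best_response, start))
    for temperature in (1.0, 0.6, 0.05):
        print(temperature, iterate(lambda mu: softmax_response(mu, temperature), start))
```

In this toy model the exact equilibrium puts mass 0.6 on action 0 (the point where both actions pay the same), and the softmax operator has Lipschitz constant 1/(2 · temperature). For temperatures above 0.5 the iteration therefore contracts, but to a regularized fixed point that is biased away from 0.6. For small temperatures, and for the exact best response, the iteration oscillates instead of converging. This mirrors the trade-off between approximation and tractability in miniature.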
Secondly, we investigate cooperative MFC. Initially, we consider an extension to environmental states under a simplifying assumption of static mean fields. Approximate optimality of an MFC solution is shown over any finite agent solution. More generally, we proceed to extend MFC to strongly interacting agents, similar to the MFG scenario. Our final extension considers partial observability, where decentralized agents act only upon available information. Here, a framework optimizing over Lipschitz classes of policies is introduced. We obtain policy gradient approximation guarantees for the latter two settings. The frameworks are verified theoretically by showing approximate optimality of MFC, and experimentally by demonstrating performance comparable or superior to state-of-the-art multi-agent reinforcement learning algorithms.
Finally, we briefly explore some potential applications of MFGs and MFC in scenarios with large populations of agents. We survey applications in distributed computing, cyber-physical systems, autonomous mobility and routing, as well as natural and social sciences. We also take a closer look at two particular applications in UAV swarm control and edge computing. In the former, we consider the effect of collision avoidance as an additional constraint for MFC in embodied robot swarms. In the latter, we compare MFG and MFC results for a computational offloading scenario.
Overall, in this thesis we investigate the suitability of methods based on MFGs and MFC for large-scale tractable multi-agent reinforcement learning. We contribute novel learning methods and theoretical approximation frameworks, and study some applications. On the whole, we find that MFGs and MFC can successfully be applied to analyze large-scale control and games, with high generality and outperforming some state-of-the-art solutions.
Item type: | Dissertation |
---|---|
Published: | 2024 |
Author(s): | Cui, Kai |
Type of entry: | First publication |
Title: | Large-Scale Multi-Agent Reinforcement Learning via Mean Field Games |
Language: | English |
Referees: | Koeppl, Prof. Dr. Heinz ; Laurière, Prof. Dr. Mathieu |
Publication date: | 24 October 2024 |
Place of publication: | Darmstadt |
Collation: | xvii, 327 pages |
Date of oral examination: | 17 October 2024 |
DOI: | 10.26083/tuprints-00028568 |
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/28568 |
Status: | Publisher's version |
URN: | urn:nbn:de:tuda-tuprints-285682 |
Dewey Decimal Classification (DDC) subject group: | 000 Generalities, computer science, information science > 004 Computer science; 600 Technology, medicine, applied sciences > 620 Engineering and mechanical engineering; 600 Technology, medicine, applied sciences > 621.3 Electrical engineering, electronics |
Department(s)/field(s): | 18 Department of Electrical Engineering and Information Technology; 18 Department of Electrical Engineering and Information Technology > Self-Organizing Systems Lab; LOEWE; LOEWE > LOEWE-Zentren; LOEWE > LOEWE-Zentren > emergenCITY |
Date deposited: | 24 Oct 2024 12:13 |
Last modified: | 25 Oct 2024 12:44 |