Besnard, Jean-Baptiste ; Tarraf, Ahmad ; Barthélemy, Clément ; Cascajo, Alberto ; Jeannot, Emmanuel ; Shende, Sameer S. ; Wolf, Felix (2023)
Towards Smarter Schedulers: Molding Jobs into the Right Shape via Monitoring and Modeling.
2nd International Workshop on Malleability Techniques Applications in High-Performance Computing (HPCMALL 2023). Hamburg, Germany (21-25 May 2023)
doi: 10.1007/978-3-031-40843-4_6
Conference publication, bibliography
Abstract
High-performance computing is not only a race towards the fastest supercomputers but also the science of using such massive machines productively to acquire valuable results – outlining the importance of performance modelling and optimization. However, it appears that more than punctual optimization is required for current architectures, with users having to choose between multiple intertwined parallelism possibilities, dedicated accelerators, and I/O solutions. Witnessing this challenging context, our paper establishes an automatic feedback loop between how applications run and how they are launched, with a specific focus on I/O. One goal is to optimize how applications are launched through moldability (launch-time malleability). As a first step in this direction, we propose a new, always-on measurement infrastructure based on state-of-the-art cloud technologies adapted for HPC. In this paper, we present the measurement infrastructure and associated design choices. Moreover, we leverage an existing performance modelling tool to generate I/O performance models. We outline sample modelling capabilities, as derived from our measurement chain showing the critical importance of the measurement in future HPC systems, especially concerning resource configurations. Thanks to this precise performance model infrastructure, we can improve moldability and malleability on HPC systems.
Item type: | Conference publication |
---|---|
Published: | 2023 |
Author(s): | Besnard, Jean-Baptiste ; Tarraf, Ahmad ; Barthélemy, Clément ; Cascajo, Alberto ; Jeannot, Emmanuel ; Shende, Sameer S. ; Wolf, Felix |
Type of entry: | Bibliography |
Title: | Towards Smarter Schedulers: Molding Jobs into the Right Shape via Monitoring and Modeling |
Language: | English |
Publication date: | 25 August 2023 |
Publisher: | Springer |
Book title: | High Performance Computing: ISC High Performance 2023 International Workshops |
Series volume: | 13999 |
Event title: | 2nd International Workshop on Malleability Techniques Applications in High-Performance Computing (HPCMALL 2023) |
Event location: | Hamburg, Germany |
Event dates: | 21-25 May 2023 |
DOI: | 10.1007/978-3-031-40843-4_6 |
Uncontrolled keywords: | EU/BMBF|ADMIRE, Malleability, Moldability, Monitoring, performance modeling, EU, BMBF |
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Parallel Programming |
Date deposited: | 04 Apr 2024 09:47 |
Last modified: | 09 Jul 2024 09:26 |
PPN: | 519669053 |