Trace-based Detection of Lock Contention in MPI One-Sided Communication

Hermanns, Marc-André ; Geimer, Markus ; Mohr, Bernd ; Wolf, Felix
Eds.: Niethammer, Christoph ; Gracia, José ; Hilbrich, Tobias ; Knüpfer, Andreas ; Resch, Michael ; Nagel, Wolfgang E. (2017)
Trace-based Detection of Lock Contention in MPI One-Sided Communication.
Proceedings of the 10th International Workshop on Parallel Tools for High Performance Computing. Stuttgart, Germany (04.10. - 05.10.2016)
doi: 10.1007/978-3-319-56702-0_6
Conference publication, Bibliography

Abstract

Performance analysis is an essential part of the development process of HPC applications. Developers need adequate tools to evaluate design and implementation decisions in order to develop efficient parallel applications effectively. It is therefore crucial that tools support the available language and library features as completely as possible, so that design decisions are not negatively influenced by the level of available tool support. The Message Passing Interface (MPI) supports three basic communication paradigms: point-to-point, collective, and one-sided. Each of these targets and excels at a specific application scenario. While current performance tools support the first two quite well, one-sided communication is often neglected. In our earlier work, we were able to reduce this gap by showing how wait states in MPI one-sided communication using active-target synchronization can be detected at large scale using our trace-based message replay technique. Further extending our work on the detection of progress-related wait states in ARMCI, this paper presents an improved infrastructure that is capable of detecting not only progress-related wait states, but also wait states due to lock contention in MPI passive-target synchronization. We present an event-based definition of lock contention, the trace-based algorithm to detect it, and initial results with a micro-benchmark and an application kernel scaling up to 65,536 processes.

Item type: Conference publication
Published: 2017
Editors: Niethammer, Christoph ; Gracia, José ; Hilbrich, Tobias ; Knüpfer, Andreas ; Resch, Michael ; Nagel, Wolfgang E.
Author(s): Hermanns, Marc-André ; Geimer, Markus ; Mohr, Bernd ; Wolf, Felix
Entry type: Bibliography
Title: Trace-based Detection of Lock Contention in MPI One-Sided Communication
Language: English
Date of publication: 9 May 2017
Publisher: Springer
Book title: Tools for High Performance Computing 2016
Event title: Proceedings of the 10th International Workshop on Parallel Tools for High Performance Computing
Event location: Stuttgart, Germany
Event date: 04.10. - 05.10.2016
Edition: 1st edition
DOI: 10.1007/978-3-319-56702-0_6
Department(s)/Field(s): 20 Department of Computer Science
20 Department of Computer Science > Parallel Programming
Date deposited: 20 Apr 2018 12:22
Last modified: 25 Jun 2024 06:14
PPN: 519356454