An Approach to Visualize Remote Socket Traffic on the Intel Nehalem-EX

Iwainsky, Christian and Reichstein, Thomas and Dahnken, Christopher and an Mey, Dieter and Terboven, Christian and Semin, Andrey and Bischof, Christian; Guarracino, M. and Vivien, F. and Träff, J. and Cannatoro, M. and Danelutto, M. and Hast, A. and Perla, F. and Knüpfer, A. and Di Martino, B. and Alexander, M. (eds.) (2011):
An Approach to Visualize Remote Socket Traffic on the Intel Nehalem-EX.
In: Euro-Par 2010 Parallel Processing Workshops, Lecture Notes in Computer Science, vol. 6586, 1st edition, pp. 523-530, Berlin / Heidelberg, Springer, ISBN 978-3-642-21877-4,
DOI: 10.1007/978-3-642-21878-1_64,
[Book Section]

Abstract

The integration of the memory controller on the processor die enables ever larger core counts in commodity hardware shared memory systems with Non-Uniform Memory Architecture properties. Shared memory parallelization with OpenMP is an elegant and widely used approach to leverage the power of such systems. The binding of the OpenMP threads to compute cores and the corresponding memory association are becoming even more critical in order to obtain optimal performance. In this work we provide a method to measure the amount of remote socket memory accesses a thread generates. We use available performance monitoring CPU counters in combination with thread binding on a quad socket Nehalem EX system. For visualization of the collected data we use Vampir.

Item Type: Book Section
Published: 2011
Editors: Guarracino, M. and Vivien, F. and Träff, J. and Cannatoro, M. and Danelutto, M. and Hast, A. and Perla, F. and Knüpfer, A. and Di Martino, B. and Alexander, M.
Creators: Iwainsky, Christian and Reichstein, Thomas and Dahnken, Christopher and an Mey, Dieter and Terboven, Christian and Semin, Andrey and Bischof, Christian
Title: An Approach to Visualize Remote Socket Traffic on the Intel Nehalem-EX
Language: English
Title of Book: Euro-Par 2010 Parallel Processing Workshops
Series Name: Lecture Notes in Computer Science
Volume: 6586
Number: 6586
Place of Publication: Berlin / Heidelberg
Publisher: Springer
Edition: 1st edition
ISBN: 978-3-642-21877-4
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Scientific Computing
Central Institutions
Date Deposited: 22 Mar 2013 10:34
DOI: 10.1007/978-3-642-21878-1_64