Joint Schedule and Layout Autotuning for Sparse Matrices with Compound Entries on GPUs

Mueller-Roemer, Johannes ; Stork, André ; Fellner, Dieter W. (2019)
Joint Schedule and Layout Autotuning for Sparse Matrices with Compound Entries on GPUs.
24th International Symposium on Vision, Modeling, and Visualization. Rostock, Germany (Sep. 30. - Oct. 02., 2019)
doi: 10.2312/vmv.20191324
Conference publication, Bibliography

Abstract

Large sparse matrices with compound entries, i.e., complex and quaternionic matrices as well as matrices with dense blocks, are a core component of many algorithms in geometry processing, physically based animation, and other areas of computer graphics. We generalize several matrix layouts and apply joint schedule and layout autotuning to improve the performance of the sparse matrix-vector product on massively parallel graphics processing units. Compared to schedule tuning without layout tuning, we achieve speedups of up to 5.5x. In comparison to cuSPARSE, we achieve speedups of up to 4.7x.

Item type:	Conference publication
Published:	2019
Author(s):	Mueller-Roemer, Johannes ; Stork, André ; Fellner, Dieter W.
Type of entry:	Bibliography
Title:	Joint Schedule and Layout Autotuning for Sparse Matrices with Compound Entries on GPUs
Language:	English
Year of publication:	2019
Event title:	24th International Symposium on Vision, Modeling, and Visualization
Event location:	Rostock, Germany
Event dates:	Sep. 30. - Oct. 02., 2019
DOI:	10.2312/vmv.20191324
Keywords:	General Purpose Computation on Graphics Processing Unit (GPGPU); GPU computing; Linear systems; Code generation
Department(s)/Division(s):	20 Department of Computer Science; 20 Department of Computer Science > Interactive Graphics Systems
Date deposited:	09 Apr 2020 12:55
Last modified:	04 Feb 2022 12:39