Optimizing Sparse Matrix-Matrix Multiplication for the GPU

Dalton, Steven; Bell, Nathan; Olson, Luke

Content Files

spmm_tr.pdf

Permalink

https://hdl.handle.net/2142/42667

Description

Title

Optimizing Sparse Matrix-Matrix Multiplication for the GPU

Author(s)

Dalton, Steven
Bell, Nathan
Olson, Luke

Issue Date

2013-03-26

Keyword(s)

parallel
sparse
gpu
matrix-matrix

Date of Ingest

2013-03-26T13:41:10Z

Abstract

Sparse matrix-matrix multiplication (SpMM) is a key operation in numerous areas from information to the physical sciences. Implementing SpMM efficiently on throughput-oriented processors, such as the graphics processing unit (GPU), requires the programmer to expose substantial fine-grained parallelism while conserving the limited off-chip memory bandwidth. Balancing these concerns, we decompose the SpMM operation into three highly parallel phases: expansion, sorting, and compression, and introduce a set of complementary bandwidth-saving performance optimizations. Our implementation is fully general and our optimizations lead to substantial efficiencies for a SpMM product.
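
The three phases named in the abstract can be illustrated with a small serial sketch. This is not the report's GPU implementation and includes none of its bandwidth-saving optimizations; it only shows, under simplifying assumptions, what expansion, sorting, and compression each compute. The function name `spmm_esc` and the use of (row, column, value) triples as the matrix format are assumptions made for this illustration.

```python
from itertools import groupby
from operator import itemgetter


def spmm_esc(a_coo, b_coo):
    """Multiply two sparse matrices given as lists of (row, col, value) triples.

    Hypothetical helper for illustration: a serial sketch of the
    expansion / sorting / compression decomposition.
    """
    # Index the entries of B by row so each entry A(i, k) can find row k of B.
    b_rows = {}
    for k, j, b_val in b_coo:
        b_rows.setdefault(k, []).append((j, b_val))

    # Expansion: each pair A(i, k), B(k, j) yields one intermediate
    # product that contributes to C(i, j). Duplicates are expected.
    expanded = [
        (i, j, a_val * b_val)
        for i, k, a_val in a_coo
        for j, b_val in b_rows.get(k, [])
    ]

    # Sorting: order by (row, col) so contributions to the same
    # output entry become adjacent.
    expanded.sort(key=itemgetter(0, 1))

    # Compression: sum each run of equal (row, col) into one entry of C.
    result = []
    for (i, j), run in groupby(expanded, key=itemgetter(0, 1)):
        result.append((i, j, sum(v for _, _, v in run)))
    return result


# A = [[1, 2], [0, 3]] and B = [[4, 0], [5, 6]], stored as triples.
a = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)]
b = [(0, 0, 4.0), (1, 0, 5.0), (1, 1, 6.0)]
c = spmm_esc(a, b)
```

Each phase is a flat pass over independent elements (a map, a sort, and a segmented sum), which is the kind of fine-grained parallelism the abstract refers to. The cost is that the expanded list can be much larger than the final product, which is why memory bandwidth becomes the limiting concern.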

Type of Resource

text

Genre of Resource

Technical Report
Article

Owning Collections

Research and Tech Reports - Computer Science PRIMARY
