Optimizing Sparse Matrix-Matrix Multiplication for the GPU
Dalton, Steven; Bell, Nathan; Olson, Luke
Permalink
https://hdl.handle.net/2142/42667
Description
Title
Optimizing Sparse Matrix-Matrix Multiplication for the GPU
Author(s)
Dalton, Steven
Bell, Nathan
Olson, Luke
Issue Date
2013-03-26
Keyword(s)
parallel
sparse
gpu
matrix-matrix
Abstract
Sparse matrix-matrix multiplication (SpMM) is a key operation in numerous areas,
from information to the physical sciences. Implementing SpMM efficiently on
throughput-oriented processors, such as the graphics processing unit (GPU), requires
the programmer to expose substantial fine-grained parallelism while conserving the
limited off-chip memory bandwidth. Balancing these concerns, we decompose the
SpMM operation into three highly parallel phases (expansion, sorting, and
compression) and introduce a set of complementary bandwidth-saving performance
optimizations. Our implementation is fully general, and our optimizations lead to
substantial efficiencies for an SpMM product.
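The three phases named in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' GPU implementation and includes none of their bandwidth-saving optimizations. It assumes both inputs are stored in compressed sparse row (CSR) form, and the function name spmm_esc and all argument names are hypothetical, chosen only for this illustration.

```python
from itertools import groupby


def spmm_esc(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, n_rows):
    """Compute C = A * B for CSR inputs via expansion, sorting, compression.

    Hypothetical helper for illustration: a_ptr/a_idx/a_val and
    b_ptr/b_idx/b_val are assumed to be CSR row pointers, column
    indices, and values; n_rows is the number of rows of A.
    """
    # Expansion: emit one (row, col, value) triple for every scalar
    # product A[i, k] * B[k, j] with both factors stored.
    triples = []
    for i in range(n_rows):
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                triples.append((i, b_idx[q], a_val[p] * b_val[q]))

    # Sorting: order the triples by (row, col) so that duplicate
    # coordinates become adjacent.
    triples.sort(key=lambda t: (t[0], t[1]))

    # Compression: sum each run of triples sharing a coordinate and
    # count the resulting entries per row.
    c_ptr = [0] * (n_rows + 1)
    c_idx, c_val = [], []
    for (i, j), run in groupby(triples, key=lambda t: (t[0], t[1])):
        c_idx.append(j)
        c_val.append(sum(v for _, _, v in run))
        c_ptr[i + 1] += 1

    # Turn the per-row counts into CSR row pointers (prefix sum).
    for i in range(n_rows):
        c_ptr[i + 1] += c_ptr[i]
    return c_ptr, c_idx, c_val


# A = [[1, 2], [0, 3]] and B = [[4, 0], [5, 6]] in CSR form.
c_ptr, c_idx, c_val = spmm_esc(
    [0, 2, 3], [0, 1, 1], [1, 2, 3],
    [0, 1, 3], [0, 0, 1], [4, 5, 6],
    n_rows=2,
)
```

In this sketch each phase is a simple data-parallel pattern (independent products, a sort, and a segmented reduction), which suggests why such a decomposition suits a throughput-oriented processor; how the authors actually map the phases to the GPU is not stated on this page.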