Achieving performance portability across parallel accelerator architectures
Kofsky, Stephen
Permalink
https://hdl.handle.net/2142/44240
Description
Title
Achieving performance portability across parallel accelerator architectures
Author(s)
Kofsky, Stephen
Issue Date
2013-05-24T21:55:16Z
Director of Research (if dissertation) or Advisor (if thesis)
Lumetta, Steven S.
Department of Study
Electrical and Computer Engineering
Discipline
Electrical and Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Performance Portability
CUDA
Parallel Programming
Rigel
Abstract
Parallel programming requires significant developer effort, and creating optimized parallel code is even more time-consuming. Even then, tuned parallel code typically performs well only on a single architecture, or even a single microarchitecture. This thesis focuses on SPMD code written in CUDA, noting that programs must obey a number of constraints to achieve high performance on an NVIDIA GPU. Under such constraints, source-level optimizations can improve the performance of CUDA code on Rigel, a MIMD accelerator architecture currently under development. Source-level optimizations can produce code for Rigel that runs significantly faster than naïve translations. In some cases, benchmarks run nearly four times faster, rivaling the performance of hand-optimized code. Unlike a GPU, Rigel allows for a flexible execution model, making it difficult to extract performance information that can be leveraged to get good performance on other architectures. CUDA code written for Rigel performs poorly when executed on a GPU and is significantly slower than optimized CUDA code tuned for GPUs.