Automatic Tuning of Matrix Multiplication Performance on Graphics Hardware
Jiang, Changhao; Snir, Marc
Permalink
https://hdl.handle.net/2142/11013
Description
- Title
- Automatic Tuning of Matrix Multiplication Performance on Graphics Hardware
- Author(s)
- Jiang, Changhao
- Snir, Marc
- Issue Date
- 2005-04
- Keyword(s)
- computer graphics
- Abstract
- Graphics hardware's performance is advancing much faster than that of conventional microprocessors. To utilize the tremendous computing power of these systems, it is critical to tune software to the architectural features of graphics hardware. The frequent changes in GPU architecture and performance characteristics make it highly desirable to automate such tuning. This paper implements an automatic tuning system that generates high-performance matrix-multiplication implementations on graphics hardware. The system uses a parameterized code generator to produce multiple versions of matrix multiplication, whose performance is evaluated empirically by actual execution on the target platform. An ad-hoc search engine searches the implementation space for the version that yields the best performance. In contrast to similar systems on CPUs, which rely on tuning strategies such as cache blocking, register tiling, and instruction scheduling, this paper identifies and exploits several tuning strategies that are unique to graphics hardware. These strategies include optimizing for multiple render targets and for SIMD instructions with data packing, and overcoming limitations on instruction count and dynamic branch instructions. The generated implementations perform comparably to an expert's manually tuned version, despite the significant overhead incurred by using the high-level BrookGPU language. As the first attempt at automatic generation of numerical libraries for graphics hardware, the results of this paper are encouraging.
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/11013
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
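
The generate-and-measure loop described in the abstract (a parameterized generator produces several versions, each version is timed by actual execution, and a search picks the fastest) can be illustrated with a minimal sketch. This is not the report's system: it runs on the CPU in plain Python rather than generating BrookGPU code, it tunes a single hypothetical parameter (a tile size named `block`), and it replaces the ad-hoc search engine with an exhaustive scan. The names `make_matmul` and `autotune` are assumptions introduced only for this illustration.

```python
import time


def make_matmul(block):
    """Return one blocked matrix-multiply variant for the given tile size.

    `block` is a hypothetical tuning parameter standing in for the
    generator parameters mentioned in the abstract.
    """
    def matmul(a, b):
        n, k, m = len(a), len(b), len(b[0])
        c = [[0.0] * m for _ in range(n)]
        # Walk the iteration space tile by tile.
        for i0 in range(0, n, block):
            for p0 in range(0, k, block):
                for j0 in range(0, m, block):
                    for i in range(i0, min(i0 + block, n)):
                        row_a, row_c = a[i], c[i]
                        for p in range(p0, min(p0 + block, k)):
                            a_ip, row_b = row_a[p], b[p]
                            for j in range(j0, min(j0 + block, m)):
                                row_c[j] += a_ip * row_b[j]
        return c
    return matmul


def autotune(a, b, blocks=(2, 4, 8, 16), repeats=3):
    """Time every variant on the given inputs; return (best_block, timings)."""
    timings = {}
    for block in blocks:
        kernel = make_matmul(block)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(a, b)
            # Keep the fastest of several runs to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    best_block = min(timings, key=timings.get)
    return best_block, timings


# Example: tune on a small 16x16 problem.
n = 16
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]
best_block, timings = autotune(a, b)
```

Which tile size wins depends on the machine and the problem size, which is the reason for measuring rather than predicting. A real tuner would likely search a much larger parameter space and would need a smarter strategy than the exhaustive scan used here.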