Memory latency reduction via data prefetching and data forwarding in shared memory multiprocessors
Poulsen, David Kristian
Permalink
https://hdl.handle.net/2142/21242
Description
Title
Memory latency reduction via data prefetching and data forwarding in shared memory multiprocessors
Author(s)
Poulsen, David Kristian
Issue Date
1994
Doctoral Committee Chair(s)
Yew, Pen-Chung
Department of Study
Electrical and Computer Engineering
Discipline
Electrical Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Engineering, Electronics and Electrical
Computer Science
Language
eng
Abstract
This dissertation considers the use of data prefetching and an alternative mechanism, data forwarding, for reducing memory latency due to interprocessor communication in cache coherent, shared memory multiprocessors. The benefits of prefetching and forwarding are considered for large, numerical application codes with loop-level and vector parallelism. Data prefetching is applied to these applications using two different multiprocessor prefetching algorithms implemented within a parallelizing compiler. Data forwarding considers array references involved in communication-related accesses between successive parallel loops, rather than within a single loop nest. A hybrid prefetching and forwarding scheme and a compiler algorithm for data forwarding are also presented.
EPG-sim, a system of execution-driven simulation tools for studying parallel architectures, algorithms, and applications, was developed as a prerequisite for this work. EPG-sim performs execution-driven simulation and critical path simulation within a single, integrated environment. EPG-sim provides an extremely wide range of cost/accuracy trade-offs and a number of novel features compared to existing execution-driven systems. The parallelism and communication behavior of numerical application codes are studied via EPG-sim critical path simulation, which establishes the potential performance of prefetching and forwarding for these codes. The evaluation of prefetching and forwarding is accomplished via detailed EPG-sim execution-driven simulations of optimized, parallel versions of these application codes.
Two multiprocessor prefetching algorithms are presented and compared. A simple blocked vector prefetching algorithm, considerably less complex than existing software pipelined prefetching algorithms, is shown to be effective in reducing memory latency and increasing performance. A Forwarding Write operation is used to evaluate the effectiveness of forwarding. Data forwarding results in significant performance improvements over data prefetching for codes exhibiting less spatial locality. A new hybrid prefetching and forwarding scheme is presented that provides increased performance stability by adapting to varying application characteristics and architectural parameters. The hybrid scheme is shown to be effective in improving the performance of forwarding with reduced cache sizes. A compiler algorithm for data forwarding is presented that implements point-to-point forwarding, hybrid prefetching and forwarding, and selective forwarding. Software and hardware support for prefetching and forwarding is also discussed.