Cache Design and Performance in a Large-Scale Shared-Memory Multiprocessor System
Chen, Yung-Chin
Permalink
https://hdl.handle.net/2142/71991
Description
Title
Cache Design and Performance in a Large-Scale Shared-Memory Multiprocessor System
Author(s)
Chen, Yung-Chin
Issue Date
1993
Doctoral Committee Chair(s)
Veidenbaum, Alexander V.
Department of Study
Electrical Engineering
Discipline
Electrical Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Engineering, Electronics and Electrical
Abstract
The use of a private cache in each processor of a large-scale shared-memory multiprocessor system can reduce long global memory latency but also introduces the cache coherence problem. Cache design and performance in a large-scale multiprocessor are affected by the cache coherence problem and by the cache coherence scheme implemented. The behavior of a parallel program usually differs from that of the same program executed sequentially. Consequently, the cache behaves differently and may not perform as well as the cache in a uniprocessor system, and some results of previous cache studies for uniprocessor systems are less applicable to multiprocessor caches. In this thesis, cache design and performance under a directory coherence scheme and a software coherence scheme in multistage-interconnection-network-based multiprocessor systems are studied using trace-driven timing simulation of numerical benchmarks. Design complexity and performance trade-offs for both schemes are studied. Their performance problems are analyzed in detail, and several improvements are proposed, evaluated, and shown to be effective. Next, the performance of the directory and software schemes is compared; the simple software scheme is shown to have better performance for numerical programs. The performance advantages and disadvantages of the two schemes are analyzed, and a new coherence scheme combining the best of both schemes is proposed. This new scheme is shown to achieve higher hit ratios. Overall, the global memory remains one of the major performance bottlenecks for a multiprocessor system even though private caches are used. The effectiveness of memory caches in reducing global memory access latency is demonstrated.