Hardware and compiler support for cache coherence in large-scale shared-memory multiprocessors

Choi, Lynn

This item is available for download only to members of the University of Illinois community. Students, faculty, and staff at the U of I may log in with their NetID and password to view the item. To access an Illinois-restricted dissertation or thesis, you can request a copy through your library's Inter-Library Loan office or purchase a copy directly from ProQuest.

Permalink

https://hdl.handle.net/2142/20919

Description

Title

Hardware and compiler support for cache coherence in large-scale shared-memory multiprocessors

Author(s)

Choi, Lynn

Issue Date

1996

Doctoral Committee Chair(s)

Padua, David A.

Department of Study

Computer Science

Discipline

Computer Science

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Engineering, Electronics and Electrical
Engineering, System Science
Computer Science

Language

eng

Abstract

Reducing memory latency is critical to the performance of large-scale parallel systems. Because memory reference patterns exhibit temporal and spatial locality, private caches can eliminate redundant memory accesses and thereby reduce both average memory latency and network traffic. However, maintaining cache coherence in such systems remains a challenge. Hardware directories can be very effective but are too expensive for large-scale multiprocessors.
As an alternative, compiler-directed techniques (4, 5, 6, 7, 8, 9, 10, 11, 14) can be used to maintain coherence. In this approach, cache coherence is maintained locally without directory hardware, thus avoiding the complexity and overhead associated with hardware directories. Although the performance of such schemes has been demonstrated through simulations, most of these studies either assume perfect compile-time analysis or rely on analytical models, without a real compiler implementation (1, 3, 9, 10, 12, 13). It is still unknown how effectively a compiler can detect potentially stale references and what performance can be obtained using a real compiler. In addition, most of the compiler-directed coherence schemes proposed to date have not addressed the real cost of the required hardware support. For example, many of the schemes require expensive hardware support and assume a cache organization with single-word cache lines.
This dissertation addresses these hardware and compiler implementation issues and investigates the feasibility and performance of the compiler-directed cache coherence approach. We propose a new compiler-directed scheme that can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors. The scheme can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system-related issues, including critical sections, inter-thread communication, and task migration, have also been addressed. The cost of the required hardware support is minimal and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data flow analysis, have been developed and implemented in the Polaris parallelizing compiler, and experimental results on the Perfect Club benchmarks (2) are discussed.

Type of Resource

text

Owning Collections

Graduate Dissertations and Theses at Illinois (primary)

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Computer Science

Dissertations and Theses from the Dept. of Computer Science
