Processor parallelism considerations and memory latency reduction in shared memory multiprocessors
Lilja, David John
Permalink
https://hdl.handle.net/2142/22386
Description
Issue Date
1991
Doctoral Committee Chair(s)
Yew, Pen-Chung
Department of Study
Electrical and Computer Engineering
Discipline
Electrical and Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Engineering, Electronics and Electrical
Computer Science
Language
eng
Abstract
A wide variety of computer architectures have been proposed to exploit parallelism at different granularities. These architectures have significant differences in instruction scheduling constraints, memory latencies, and synchronization overhead, making it difficult to determine which architecture can achieve the best performance on a given program. Trace-driven simulations and analytic models are used to compare the instruction-level parallelism of a superscalar processor and a pipelined processor with the loop-level parallelism of a shared memory multiprocessor. It is shown that the maximum speedup for a loop with a cyclic dependence graph is limited by its critical dependence ratio, independent of the number of iterations in the loop. The fine-grained processors are better suited for executing these loops with cyclic dependence graphs, while the multiprocessor has better performance on the very parallel loops with acyclic dependence graphs. When executing programs with a variety of loops and sequential code, the best performance is obtained using a multiprocessor architecture in which each individual processor has a fine-grained parallelism of two to four.
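To make the distinction concrete, the C fragment below (an illustrative sketch only, not code from the dissertation) contrasts a loop whose dependence graph is cyclic with one whose graph is acyclic. On this reading, the critical dependence ratio can be thought of as the work in one iteration divided by the work lying on the dependence cycle, so neither more iterations nor more processors can raise the speedup of the first loop past that ratio, while the second loop scales with the number of processors.

/* Illustrative only -- not code from the dissertation. */

#define N 1024

double a[N], b[N], c[N];

/* Cyclic dependence graph: a[i] depends on a[i-1], so the iterations
 * form a chain.  Extra processors cannot shorten the chain; only
 * finer-grained overlap of the work inside each iteration helps. */
void recurrence(void)
{
    for (int i = 1; i < N; i++)
        a[i] = a[i - 1] * b[i] + c[i];
}

/* Acyclic dependence graph: every iteration is independent, so the
 * loop is well suited to loop-level (multiprocessor) parallelism. */
void independent(void)
{
    for (int i = 0; i < N; i++)
        a[i] = b[i] * c[i];
}

int main(void)
{
    recurrence();
    independent();
    return 0;
}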
A major problem with this type of shared memory multiprocessor architecture is the long latency in fetching operands from the shared memory. Private data caches are an effective means of reducing this latency, but they introduce the complexity of a cache coherence mechanism. Both hardware and software schemes have been proposed for maintaining coherence in these systems. Unfortunately, hardware schemes have very high memory requirements, and software schemes rely on imprecise compile-time memory disambiguation. A new compiler-assisted directory coherence mechanism is proposed that combines the best aspects of the hardware and software approaches while eliminating many of their disadvantages. The pointer cache directory significantly reduces the size of a hardware directory by dynamically binding pointers to cache blocks only when the blocks are actually referenced. Compiler optimizations can further reduce the size of the directory by signaling the hardware to allocate pointers only when they are needed. Detailed trace-driven simulations show that the performance of this new approach is comparable to other coherence schemes, but with significantly lower memory requirements.
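The C sketch below is an assumption of how such a pointer-based directory might look, not the dissertation's actual design; the names, sizes, and structure are hypothetical. It shows the basic idea of binding a directory pointer to a cache block only when the block is first referenced, so directory storage scales with the number of blocks actually cached rather than with the size of shared memory, whereas a full-map hardware directory must reserve presence bits for every memory block whether or not it is ever cached.

/* Illustrative sketch only; all names and sizes are hypothetical. */

#include <stdio.h>

#define NUM_PTRS 64             /* pointer-cache capacity (hypothetical) */

struct ptr_entry {
    unsigned long block;        /* address of the cached block */
    int           proc;         /* processor holding a copy */
    int           valid;
};

static struct ptr_entry dir[NUM_PTRS];

/* Allocate a pointer for a block on its first reference by a processor.
 * A real mechanism would also handle overflow, invalidation, and the
 * compiler-directed allocation hints mentioned in the abstract. */
static int bind_pointer(unsigned long block, int proc)
{
    for (int i = 0; i < NUM_PTRS; i++) {
        if (!dir[i].valid) {
            dir[i].block = block;
            dir[i].proc  = proc;
            dir[i].valid = 1;
            return 0;
        }
    }
    return -1;                  /* directory full: would need eviction */
}

int main(void)
{
    bind_pointer(0x1000, 3);    /* processor 3 caches block 0x1000 */
    bind_pointer(0x2040, 7);    /* processor 7 caches block 0x2040 */
    printf("pointer entries in use: 2 of %d\n", NUM_PTRS);
    return 0;
}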