Compiler optimizations for parallel loops with fine-grained synchronization
Chen, Ding-Kai
Permalink
https://hdl.handle.net/2142/19194
Description
Issue Date
1994
Doctoral Committee Chair(s)
Yew, Pen-Chung
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Computer Science
Language
eng
Abstract
Loops in scientific and engineering applications provide a rich source of parallelism. To obtain a higher level of parallelism, loops with loop-carried dependences, which are largely serialized by traditional techniques, need to be parallelized with fine-grained synchronization. This approach, called DOACROSS parallelization, requires new optimization strategies that preserve parallelism while minimizing inter-processor communication. In this thesis, I closely examine the issues involved in DOACROSS parallelization. The work has two foci: (1) increasing parallelism, and (2) reducing communication overhead. Strategies for four major optimization problems are proposed and described in detail in this thesis. Our statement re-ordering and redundant synchronization elimination can enhance the overlap between iterations, with reduced synchronization, for loops with uniform dependences (i.e., with constant dependence distance vectors). For loops with non-uniform dependences, a new dependence uniformization scheme composes a small set of uniform dependences with a small dependence cone. These uniform dependences can be used to preserve all non-uniform dependences, and a small dependence cone size implies a higher speedup and lower communication overhead. Our last loop transformation is for loops whose dependences are unknown at compile time. It is a runtime parallelization technique with good locality and high concurrency, and its performance is fairly consistent because its runtime analysis, usually the performance bottleneck of runtime schemes, requires less global communication. We provide performance measurements and comparisons with previously proposed schemes. The results indicate that our schemes outperform earlier ones in terms of higher parallelism and lower communication requirements. These schemes form an integral part of future high-performance parallelizing compilers.
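To illustrate the idea behind DOACROSS parallelization described in the abstract, here is a minimal sketch in Python of a loop with a loop-carried dependence of distance 1, executed by several workers pipelined with fine-grained post/wait synchronization. The cyclic iteration distribution, the `doacross` name, and the use of `threading.Event` as the post/wait primitive are illustrative assumptions, not the thesis's actual implementation.

```python
import threading

def doacross(n, num_workers=4):
    """Run a[i] = a[i-1] + i for i = 1..n with cross-iteration sync.

    The dependence a[i] -> a[i-1] has constant distance 1, so iterations
    are distributed cyclically and each one waits only for its producer.
    """
    a = [0] * (n + 1)
    # done[i] is "posted" once iteration i has written a[i]
    done = [threading.Event() for _ in range(n + 1)]
    done[0].set()  # the input of iteration 1 is ready up front

    def worker(start):
        # cyclic (round-robin) distribution of iterations to workers
        for i in range(start, n + 1, num_workers):
            done[i - 1].wait()    # wait: consumer side of the dependence
            a[i] = a[i - 1] + i   # loop body with the carried dependence
            done[i].set()         # post: release the dependent iteration

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(1, num_workers + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a[n]
```

With a dependence distance of 1 the loop is fully serialized through the post/wait chain, which is exactly the situation where the abstract's optimizations (reordering statements and removing redundant synchronization) matter: the less of each iteration sits between the wait and the post, the more the iterations can overlap.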