Communication and Synchronization in Parallel Computation
Jayasimha, Doddaballapur Narasimha-Murthy
Permalink
https://hdl.handle.net/2142/69600
Description
Title
Communication and Synchronization in Parallel Computation
Author(s)
Jayasimha, Doddaballapur Narasimha-Murthy
Issue Date
1988
Doctoral Committee Chair(s)
Loui, Michael C.
Lawrie, Duncan H.
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Computer Science
Abstract
In this thesis we study communication and synchronization overheads in parallel algorithms executing on shared memory multiprocessors.
In the first part, we focus on the communication requirements of parallel algorithms. We define a communication complexity measure for parallel algorithms, modeling the execution of a parallel algorithm on a shared memory multiprocessor by a directed acyclic graph (DAG). We derive a lower bound on the communication for a particular class of graphs corresponding to nested iterative loops. For the DAG representing a triangular solver of size $n$, we show that the communication ($c$) and parallel time ($t$) satisfy $ct = \Omega(n^3)$, irrespective of the number of processors used; thus there is a trade-off between computation time and communication. Finally, we define a synchronization complexity measure.
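For concreteness, a minimal sequential forward-substitution solver for a lower triangular system of size $n$ is sketched below; its dependence structure is the kind of DAG the bound above refers to. This is an illustrative sketch only, not code from the thesis.

```python
def solve_lower_triangular(L, b):
    """Solve L x = b by forward substitution, where L is lower triangular.

    Computing x[i] requires all of x[0..i-1], so the dependence DAG has a
    chain structure across rows -- the source of the communication/time
    trade-off discussed above.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x
```

For example, with $L = \begin{pmatrix}2 & 0\\ 1 & 1\end{pmatrix}$ and $b = (4, 3)$, the solver returns $x = (2, 1)$.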
In the second part, we consider the commonly required forms of synchronization in a multiprocessor: barrier, reporting, and mutual exclusion. We study three schemes to implement barrier synchronization and develop analytical models to estimate their performance. For $n$ processes, the first scheme requires $O(n \log n)$ accesses, as opposed to the minimum of $2n - 1$ accesses. The second scheme, which uses a synchronization tree, requires $O(\log n)$ accesses to any node of the tree. The third scheme grows linearly in the number of accesses (approximately $3n$). We study two of the synchronization schemes through simulation. The trace-driven simulator is based on the Omega interconnection network, and the simulation results are in broad agreement with our analytical results.
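To illustrate what a barrier guarantees, here is a minimal centralized sense-reversing barrier in Python. This is a generic sketch for exposition, not one of the three schemes analyzed in the thesis.

```python
import threading

class SenseBarrier:
    """Centralized sense-reversing barrier: no thread passes wait()
    until all n threads have arrived; reusable across phases."""

    def __init__(self, n):
        self.n = n
        self.count = n
        self.sense = True
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            local_sense = self.sense
            self.count -= 1
            if self.count == 0:
                # Last arriver resets the count and flips the sense,
                # releasing every waiter of this phase.
                self.count = self.n
                self.sense = not self.sense
                self.cond.notify_all()
            else:
                while self.sense == local_sense:
                    self.cond.wait()

# Demo: four threads each record an arrival, then cross the barrier;
# afterwards every thread observes all four arrivals.
arrivals, observed = [], []
barrier = SenseBarrier(4)

def worker(i):
    arrivals.append(i)
    barrier.wait()
    observed.append(len(arrivals))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(observed)  # every entry is 4
```

The sense flip is what makes the barrier reusable: a thread from the next phase cannot be confused with a straggler from the previous one.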
We next introduce a new synchronization primitive, the distributed synchronizer. An efficient implementation of this primitive requires simple hardware enhancements at the switching elements of the processor-memory multistage interconnection network. The primitive implements reporting and the barrier with negligible overheads, and the semaphore operations are bounded fair.
Finally, we study bounded fairness. Bounded fairness is stronger than the usual notion of fairness based on eventuality. We give applications of the bounded fairness concept. We formalize bounded fairness by introducing a new operator $\langle k\rangle$ into temporal logic. We show that the new logic is more powerful than the temporal logic with the eventuality operator, and is complete and satisfiable.
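One natural reading of the $\langle k\rangle$ operator, sketched here as an assumption (the thesis's formal semantics may differ in detail): $\langle k\rangle\varphi$ asserts that $\varphi$ holds within at most $k$ steps, strengthening the unbounded eventuality operator $\Diamond\varphi$, which promises only that $\varphi$ holds at some future step.

```latex
% Illustrative semantics over a state sequence \sigma (an assumption, not quoted from the thesis):
(\sigma, i) \models \langle k\rangle\varphi
  \iff
  \exists j \text{ with } i \le j \le i + k \text{ such that } (\sigma, j) \models \varphi
```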