Task flow: A novel approach to fine-grain wafer-scale parallel computing
Horst, Robert Whiting
Permalink
https://hdl.handle.net/2142/20291
Description
Title
Task flow: A novel approach to fine-grain wafer-scale parallel computing
Author(s)
Horst, Robert Whiting
Issue Date
1991
Doctoral Committee Chair(s)
Patel, Janak H.
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Engineering, Electronics and Electrical
Computer Science
Language
eng
Abstract
This thesis introduces a parallel computer architecture known as task flow. Simple replicated cells contain both memory packets and processing logic. Each memory packet contains a data field, the next instruction, and a link. Transmission packets flow between memory packets to communicate instructions and temporary data values. Program execution is performed by multiple tasks that communicate through memory variables. Each task may remain local to one cell for extended periods or may flow between cells with as few as one instruction executed in each cell.
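The packet structure described above can be sketched as follows. This is an illustrative model only: the field names, the single toy "add" instruction, and the one-instruction-per-cell flow rule are assumptions inferred from the abstract, not the dissertation's actual encoding.

```python
from dataclasses import dataclass

@dataclass
class MemoryPacket:
    """One word of a task flow cell: data plus the thread of control.
    Field names are illustrative, inferred from the abstract."""
    data: int          # data field
    instruction: str   # next instruction to execute when a task arrives
    link: int          # address of the successor memory packet

@dataclass
class TransmissionPacket:
    """Flows between memory packets, carrying a task's temporary state."""
    dest: int          # address of the target memory packet
    temp: int          # temporary data value carried by the task

def step(cells: dict, tp: TransmissionPacket) -> TransmissionPacket:
    """Execute one instruction at the packet's destination, then flow on.
    Only a toy 'add' instruction is modeled here."""
    mp = cells[tp.dest]
    if mp.instruction == "add":
        tp.temp += mp.data
    return TransmissionPacket(dest=mp.link, temp=tp.temp)

# A task flowing through three cells, accumulating their data fields.
cells = {
    0: MemoryPacket(data=1, instruction="add", link=1),
    1: MemoryPacket(data=2, instruction="add", link=2),
    2: MemoryPacket(data=3, instruction="add", link=0),
}
task = TransmissionPacket(dest=0, temp=0)
for _ in range(3):
    task = step(cells, task)
print(task.temp)  # 1 + 2 + 3 = 6
```

The point of the model is that control and data travel together in the transmission packet, while each memory packet names its own successor, so a task can execute as little as one instruction per cell before flowing on.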
SWIFT is a register-transfer simulation that emulates program execution in a ring of task flow cells. SWIFT runs programs generated by a new assembly language, LIFT. Program fragments written in LIFT explore task flow programming of distributed reduction operations, sparse matrix arithmetic, and associative memory. The performance of coarse-grain application synchronization is modeled using test-and-set locking, full/empty bits, and a new locking method called rotating critical sections.
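Of the synchronization methods modeled, test-and-set locking is the conventional baseline; the details of full/empty bits and the new rotating-critical-section method are not given in the abstract, so only the baseline is sketched here. In this minimal sketch a Python `threading.Lock` stands in for the hardware's atomic read-modify-write primitive.

```python
import threading

class TestAndSetLock:
    """Spin lock built on a test-and-set primitive. The inner guard
    stands in for an atomic read-modify-write instruction."""
    def __init__(self):
        self._flag = False
        self._guard = threading.Lock()

    def test_and_set(self) -> bool:
        """Atomically set the flag and return its previous value."""
        with self._guard:
            old = self._flag
            self._flag = True
            return old

    def acquire(self):
        while self.test_and_set():   # spin until the old value was 'free'
            pass

    def release(self):
        self._flag = False

counter = 0
lock = TestAndSetLock()

def worker():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4 workers x 1000 increments = 4000
```

Spinning on test-and-set is cheap to implement but generates repeated traffic to the lock variable under contention, which is precisely the kind of cost the alternative schemes modeled in the thesis aim to reduce.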
A new linear-array wafer-scale integration architecture has been developed for task flow computing. The architecture has low area overhead, yet is effective in tolerating defective cells on the wafer. A simple configuration algorithm guarantees inclusion of all reachable functional cells. The fixed number of logic levels between cells, and phase-shifted synchronous clocking, help to reduce cycle times.
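A generic configuration pass for a linear array with defect bypass can be sketched as follows. The bounded-hop bypass model is an assumption for illustration; the dissertation's actual algorithm and interconnect limits differ in detail.

```python
def configure(defective, max_hops=2):
    """Chain together functional cells along a linear array, skipping
    defective ones. max_hops models a limited bypass distance in the
    interconnect (an assumption, not the dissertation's parameter).
    Returns the indices of cells included in the working array."""
    chain = []
    i = 0
    n = len(defective)
    while i < n:
        if not defective[i]:
            chain.append(i)      # functional cell: include and advance
            i += 1
        else:
            # try to hop over a run of defective cells
            for hop in range(1, max_hops + 1):
                if i + hop < n and not defective[i + hop]:
                    i += hop
                    break
            else:
                break            # run too long to bypass; rest unreachable
    return chain

# Cells 1, 4, 5 defective; a 2-hop bypass reaches everything else.
print(configure([False, True, False, False, True, True, False]))
# -> [0, 2, 3, 6]
```

Under this model every functional cell that the bypass wiring can reach is included, matching the guarantee stated above; a run of defects longer than the bypass distance cuts off the remainder of the array.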
The performance of a sparse matrix-vector multiplication program is evaluated with different matrix sizes and densities. A 100-cell SWIFT simulation of a 1,000 by 1,000 sparse matrix shows a speedup of 79 over the best serial algorithm. These results, when extrapolated to the simulation of artificial neural networks, predict a performance of 759 million connections-per-second with 1,000 neurons and 100,000 connections.
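For context on the serial side of that comparison, a standard sequential sparse matrix-vector multiply uses compressed sparse row (CSR) storage so that work scales with the nonzeros rather than the full matrix. This is a generic sketch; the abstract does not specify which serial algorithm the speedup of 79 is measured against.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x for a sparse matrix A in compressed sparse row form:
    values  - nonzero entries, row by row
    col_idx - column index of each nonzero
    row_ptr - start offset of each row in values (length n_rows + 1)"""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# 2x2 example: A = [[1, 2], [0, 3]], x = [1, 1]
print(csr_matvec([1, 2, 3], [0, 1, 1], [0, 2, 3], [1, 1]))  # -> [3.0, 3.0]
```

The inner loop touches each stored nonzero exactly once, which is why density, not matrix dimension alone, governs the serial cost that the 100-cell simulation is compared against.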
The thesis concludes with a discussion of ways in which the basic architecture may be extended through different interconnection networks, increased pipelining, improved memory organization, and other architectural changes. The thesis is put into perspective by comparing task flow to other parallel architectures and by comparing SWIFT to other parallel machines.