Measurement-based performance analysis and modeling of parallel systems
Natarajan, Chitra
Permalink
https://hdl.handle.net/2142/23088
Description
Title
Measurement-based performance analysis and modeling of parallel systems
Author(s)
Natarajan, Chitra
Issue Date
1996
Doctoral Committee Chair(s)
Iyer, Ravishankar K.
Department of Study
Electrical Engineering
Discipline
Electrical Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Engineering, Electronics and Electrical
Computer Science
Language
eng
Abstract
The CPUs, memory, interconnection network, operating system, runtime system, I/O subsystem, and application characteristics all play an important role in determining the overall performance obtained from a parallel system. However, previous studies have mostly examined memory, operating system, or network performance in isolation; a global view of the overheads from these different system perspectives has been lacking. In this dissertation, we characterize the overheads for large application benchmarks executing on the Cedar shared-memory parallel system from the perspectives of the operating system, runtime-system parallelization, and global memory and interconnection network contention.
Parallel systems are often used in multiprogrammed environments, yet the scalability of multiprogrammed shared-memory parallel systems has not been studied before. We investigate the scalability of the Cedar system in multiprogrammed environments and show that fine-grained loop-parallel applications executing in multiprogrammed workloads obtain no performance improvement with scaling. We also demonstrate that the overhead due to multiprogramming drops exponentially as the loop granularity is increased. We then propose and implement a self-preemption technique to improve the performance of fine-grained applications in multiprogrammed environments.
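The dissertation does not reproduce the self-preemption implementation in this abstract, so the following is only a minimal sketch of the general idea, assuming that "self-preemption" means a worker thread voluntarily yields the CPU when it runs out of fine-grained loop iterations, rather than busy-waiting while co-scheduled jobs hold the processors. The loop body, thread counts, and use of sched_yield() are illustrative assumptions, not the Cedar code.

```c
/* Hypothetical sketch of self-preemption for fine-grained loop-parallel
 * workers under multiprogramming: a worker that finishes its share of
 * iterations yields the CPU (sched_yield) instead of spinning, so threads
 * of other applications can run.  Not the actual Cedar implementation. */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

#define N_ITER   100000
#define N_THREAD 4

static atomic_int next_iter;          /* shared loop-iteration counter   */
static atomic_int done_count;         /* workers that finished the loop  */
static double     partial[N_THREAD];  /* per-thread partial results      */

static void *worker(void *arg)
{
    int id = (int)(long)arg;
    for (;;) {
        int i = atomic_fetch_add(&next_iter, 1);  /* grab one fine-grained iteration */
        if (i >= N_ITER)
            break;
        partial[id] += (double)i * 0.5;           /* stand-in for the loop body */
    }
    atomic_fetch_add(&done_count, 1);
    /* Self-preempt while waiting for the other workers instead of spinning. */
    while (atomic_load(&done_count) < N_THREAD)
        sched_yield();
    return NULL;
}

int main(void)
{
    pthread_t t[N_THREAD];
    for (long id = 0; id < N_THREAD; id++)
        pthread_create(&t[id], NULL, worker, (void *)id);
    for (int id = 0; id < N_THREAD; id++)
        pthread_join(t[id], NULL);

    double sum = 0.0;
    for (int id = 0; id < N_THREAD; id++)
        sum += partial[id];
    printf("sum = %f\n", sum);
    return 0;
}
```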
To balance the processor performance of parallel systems with sufficient I/O performance, several parallel I/O systems have been developed in recent years; however, very little is understood about their performance. We characterize the performance of the PIOUS parallel I/O system on the DEC Alphacluster via real system measurements and show that the message-passing processing overheads at the compute and I/O nodes limit the throughput they can sustain. We also use these measurements to provide realistic input parameters to PioSim, a parallel I/O simulation environment we have developed.
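The abstract does not show how the message-passing overheads were measured; the sketch below is only a generic ping-pong microbenchmark illustrating one common way to estimate per-message processing cost between a "compute node" and an "I/O node". It uses MPI purely for illustration; it is not PIOUS code, and the message size and repetition count are arbitrary assumptions.

```c
/* Hypothetical ping-pong microbenchmark for estimating per-message
 * latency between a compute node (rank 0) and an I/O node (rank 1). */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define MSG_BYTES 4096
#define REPS      1000

int main(int argc, char **argv)
{
    int rank;
    char buf[MSG_BYTES];
    memset(buf, 0, sizeof buf);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {                       /* "compute node" side */
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {                /* "I/O node" side */
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("avg one-way latency: %.2f us\n",
               (t1 - t0) * 1e6 / (2.0 * REPS));

    MPI_Finalize();
    return 0;
}
```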
PioSim offers a number of unique features: (1) two architecture models, remote-disk and local-disk; (2) two usage models, simple and intelligent parallel I/O; and (3) an application-oriented synthetic parallel I/O workload generator, PioSyn, capable of modeling a wide variety of temporal and spatial application file access patterns. We illustrate the potential of PioSim and PioSyn through experiments on the Alphacluster model for scientific, database, and videoserver workloads.
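To make the idea of a synthetic workload generator concrete, here is a toy sketch in the spirit of PioSyn; it is a hypothetical illustration, not the actual PioSyn code. It assumes temporal behavior can be modeled by an exponential think time between requests and spatial behavior by a sequential or strided offset pattern, and it simply prints a trace of (time, offset, size) requests that a simulator could consume.

```c
/* Toy synthetic parallel I/O workload generator (hypothetical sketch).
 * Temporal pattern: exponential think time between requests.
 * Spatial pattern: sequential or strided file offsets. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    double time;    /* request issue time (seconds) */
    long   offset;  /* file offset in bytes         */
    long   size;    /* request size in bytes        */
} io_request;

/* Exponentially distributed think time with the given mean. */
static double exp_think(double mean)
{
    double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
    return -mean * log(u);
}

int main(void)
{
    const long   req_size   = 64 * 1024;      /* 64 KB requests            */
    const long   stride     = 4 * req_size;   /* stride for strided access */
    const double mean_think = 0.010;          /* 10 ms mean inter-arrival  */
    const int    n_requests = 20;
    const int    strided    = 1;              /* 0 = sequential access     */

    double now = 0.0;
    long offset = 0;
    for (int i = 0; i < n_requests; i++) {
        io_request r = { now, offset, req_size };
        printf("%.6f  offset=%ld  size=%ld\n", r.time, r.offset, r.size);
        now    += exp_think(mean_think);
        offset += strided ? stride : req_size;
    }
    return 0;
}
```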