Exploiting software information for an efficient memory hierarchy
Komuravelli, Rakesh
Permalink
https://hdl.handle.net/2142/72791
Description
- Title
- Exploiting software information for an efficient memory hierarchy
- Author(s)
- Komuravelli, Rakesh
- Issue Date
- 2015-01-21
- Director of Research (if dissertation) or Advisor (if thesis)
- Adve, Sarita V.
- Doctoral Committee Chair(s)
- Snir, Marc
- Committee Member(s)
- Adve, Sarita V.
- Adve, Vikram S.
- Hwu, Wen-Mei W.
- Iyer, Ravi
- Pokam, Gilles
- Montesinos, Pablo
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Computer architecture
- cache coherence
- multicores
- heterogeneous systems
- protocol verification
- memory hierarchy
- Abstract
- Power consumption is one of the most important factors in the design of today’s processor chips. Multicore and heterogeneous systems have emerged to address rising power concerns. Since the memory hierarchy is becoming one of the major consumers of the on-chip power budget in these systems, designing an efficient memory hierarchy is critical to future systems. We identify three sources of inefficiency in the memory hierarchies of today’s systems: (a) coherence, (b) data communication, and (c) data storage. This thesis takes the position that many of these inefficiencies are a result of today’s software-agnostic hardware design, and that much of the information available in software can be exploited to build an efficient memory hierarchy. The thesis identifies some of the inefficiencies related to each of the three sources above and proposes techniques that mitigate them by exploiting information from the software. First, we focus on inefficiencies related to coherence and communication. Today’s hardware-based directory coherence protocols are extremely complex and incur unnecessary overheads for sending invalidation messages and maintaining sharer lists. We propose DeNovo, a hardware-software co-designed protocol, to address these issues for a class of programs that are deterministic. DeNovo assumes a disciplined programming environment and exploits features such as structured parallel control, data-race-freedom, and software information about data access patterns to build a system that is simple, extensible, and performance-efficient compared to today’s protocols. We also extend DeNovo with two optimizations that address the inefficiencies related to data communication, specifically aimed at reducing unnecessary on-chip network traffic.
We show that adding these two optimizations not only introduced no new states (or transient states) to the protocol but also provided performance and energy gains to the system, thus validating the extensibility of the DeNovo protocol. Together with the two communication optimizations, DeNovo reduces memory stall time by 32% and network traffic by 36% (resulting in direct energy savings) on average compared to a state-of-the-art implementation of the MESI protocol for the applications studied. Next, we address the inefficiencies related to data storage. Caches and scratchpads are two popular organizations for storing data in today’s systems, but both have inefficiencies. Caches are power-hungry, incurring expensive tag lookups, and scratchpads incur unnecessary data movement because they are only locally visible. To address these problems, we propose a new memory organization, stash, which combines the best of the cache and scratchpad organizations. Stash is a globally visible unit, and its functionality is independent of the coherence protocol employed. In our implementation, we extend DeNovo to provide coherence for stash. Compared to a baseline configuration that has both scratchpad and cache accesses, we show that the stash configuration (in which scratchpad and cache accesses are converted to stash accesses), even with today’s applications that do not fully exploit stash, reduces execution time by 10% and energy consumption by 14% on average. Overall, this thesis shows that a software-aware hardware design can effectively address many of the inefficiencies found in today’s software-oblivious memory hierarchies.
- Graduation Semester
- 2014-12
- Permalink
- http://hdl.handle.net/2142/72791
- Copyright and License Information
- Copyright 2014 Rakesh Komuravelli
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science