A procedure for tightly integrating accelerators into out-of-order processors
Jin, Robert
Permalink
https://hdl.handle.net/2142/115624
Description
Title
A procedure for tightly integrating accelerators into out-of-order processors
Author(s)
Jin, Robert
Issue Date
2022-04-29
Director of Research (if dissertation) or Advisor (if thesis)
Kim, Nam Sung
Department of Study
Electrical and Computer Engineering
Discipline
Electrical and Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
accelerator
out-of-order processor
Abstract
Hardware accelerators are now widely used to speed up application domains ranging from machine learning to graph processing. Efficient as they are, these accelerators do not stand alone; they generally serve as co-processors integrated in some fashion with the system's main CPU. Accelerator integration can be broadly categorized as on-CPU (tightly coupled) or off-CPU (loosely coupled). Off-CPU integration, in which the accelerator resides on a system bus or interconnect, is well explored and supported by a host of existing industry standards and protocols. This work tackles the problem of on-CPU integration, which is more complex and intrusive because it requires modifying the CPU itself, but in turn allows low-latency data transfer. Our goal is to propose a general procedure that makes tight integration simpler and reusable across typical out-of-order processors. We introduce a CPU-accelerator hardware interface that allows communication, data transfer, and resource sharing to be customized to the needs of a specific accelerator. Specifically, this interface provides low-latency communication, register sharing, and simplified control transfer to ensure correct execution. We implement the interface on an out-of-order RSD core for evaluation and then study how different types of accelerators would use it to communicate with the CPU.