Optimizing network data movement in server class microprocessors
Agarwal, Siddharth
Permalink
https://hdl.handle.net/2142/122175
Description
Title
Optimizing network data movement in server class microprocessors
Author(s)
Agarwal, Siddharth
Issue Date
2023-12-06
Director of Research (if dissertation) or Advisor (if thesis)
Kim, Nam Sung
Department of Study
Electrical & Computer Eng
Discipline
Electrical & Computer Engr
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Networking
Computer Architecture
Abstract
High-bandwidth network interface cards (NICs), each capable of transferring hundreds of gigabits per second, are making inroads into the servers of next-generation data centers. Such unprecedented data delivery rates impose immense pressure, especially on the server’s memory subsystem, because NICs first transfer network data to DRAM before it is processed.
The cache hierarchy has evolved in response, supporting direct data input-output (DDIO) technology to place network data directly in the last-level cache (LLC). Subsequently, various policies have been explored to manage such LLCs and have proven effective in reducing service latency and memory bandwidth consumption of network applications. However, a more recent evolution of the cache hierarchy has decreased the per-core LLC size while significantly increasing that of the mid-level cache (MLC), under a non-inclusive policy.
This calls for a re-examination of the aforementioned DDIO technology and its management policies. In this work, we first develop a framework for modeling 100 Gigabit networking in a cycle-accurate architectural simulator. We also identify shortcomings of the static data placement policy, which places network data in the LLC first, and of the non-inclusive policy on a commercial server system.
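As a rough illustration of the static placement policy at issue here (not the thesis's actual model), the decision a DDIO-style policy makes for each inbound NIC write can be sketched as below; `DdioConfig`, `placeInboundWrite`, and the two-way I/O limit are hypothetical stand-ins rather than vendor-documented parameters.

```cpp
// Illustrative sketch of a *static* DDIO-style placement decision for inbound
// NIC DMA writes. All names and the way-limit value are hypothetical; real
// hardware behavior is more involved.
#include <cstddef>
#include <iostream>

enum class Destination { LLC, DRAM };

struct DdioConfig {
    bool   enabled      = true;  // direct-to-LLC allocation on/off
    size_t io_way_limit = 2;     // LLC ways usable by inbound I/O
};

// Decide where one cache-line-sized inbound write is allocated.
Destination placeInboundWrite(const DdioConfig& cfg, size_t io_ways_in_use) {
    // Static policy: always try the LLC first, regardless of whether a core
    // will consume the data soon; spill to DRAM only when the I/O partition
    // is exhausted.
    if (cfg.enabled && io_ways_in_use < cfg.io_way_limit)
        return Destination::LLC;
    return Destination::DRAM;
}

int main() {
    DdioConfig cfg;
    std::cout << (placeInboundWrite(cfg, 0) == Destination::LLC ? "LLC" : "DRAM")
              << '\n';  // prints "LLC": the static policy never considers the MLC
}
```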
We then propose an intelligent direct input-output (IDIO) technology that extends DDIO to the MLC and provides three synergistic mechanisms: (1) self-invalidating I/O buffer, (2) network-driven MLC prefetching, and (3) selective direct DRAM access. Our detailed experiments using a full-system simulator show that IDIO significantly reduces data movement (up to 84% MLC and LLC write-back reduction), provides LLC isolation (up to 22% performance improvement), and improves tail latency (up to 38% reduction in 99th-percentile latency) for receive-intensive network applications.
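Under the same toy model, the abstract's three IDIO mechanisms could be read as a per-line steering decision plus an invalidation step after consumption. The sketch below is an assumption-laden interpretation of the abstract, not the thesis design; every identifier (`IdioPolicy`, `steerInboundLine`, `onLineConsumed`, the prefetch-window parameter) is hypothetical.

```cpp
// Illustrative sketch of the three IDIO mechanisms named in the abstract,
// expressed as per-line steering plus post-consumption invalidation.
// Everything here is a hypothetical reading of the abstract, not the thesis design.
#include <cstddef>

enum class Destination { MLC, LLC, DRAM };

struct LineInfo {
    bool   consumer_known;        // flow steering identifies the consuming core
    bool   likely_read_soon;      // e.g., headers or data near the RX queue head
    size_t bytes_ahead_in_queue;  // backlog in front of this line
};

struct IdioPolicy {
    size_t mlc_prefetch_window = 64 * 1024;  // push-ahead distance into the MLC
};

// (2) Network-driven MLC prefetching and (3) selective direct DRAM access.
Destination steerInboundLine(const IdioPolicy& p, const LineInfo& line) {
    if (!line.likely_read_soon)
        return Destination::DRAM;          // deep backlog: bypass the caches
    if (line.consumer_known && line.bytes_ahead_in_queue < p.mlc_prefetch_window)
        return Destination::MLC;           // push toward the consuming core's MLC
    return Destination::LLC;               // otherwise behave like conventional DDIO
}

// (1) Self-invalidating I/O buffer: once the stack has read a line, drop it
// instead of writing it back, avoiding the dirty evictions behind the
// write-back reduction the abstract reports.
void onLineConsumed(bool& line_valid) { line_valid = false; }

int main() {
    IdioPolicy p;
    LineInfo head{true, true, 0};          // line at the front of the RX queue
    return steerInboundLine(p, head) == Destination::MLC ? 0 : 1;
}
```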