Reliability Models for Double Chipkill Detect/Correct Memory Systems
Author(s)
Jian, Xun
Blanchard, Sean
Debardeleben, Nathan
Sridharan, Vilas
Kumar, Rakesh
Issue Date
2013-03
Keyword(s)
ECC
Memory
Chipkill correct
Modeling
Reliability
Abstract
Chipkill correct is an advanced type of error correction used in memory subsystems. Existing analytical approaches for modeling the reliability of memory subsystems with chipkill correct are limited to solutions that guarantee correction of errors in only a single DRAM device. However, chipkill correct solutions that guarantee the detection, and even the correction, of errors in up to two DRAM devices have become common in existing HPC systems, and analytical reliability models are needed for such memory subsystems. This paper proposes analytical models for the reliability of double chipkill detect and/or correct. Validation shows that the outputs of our analytical models are, on average, within 3.9% of the results of Monte Carlo simulations.
Publisher
Coordinated Science Laboratory, University of Illinois at Urbana-Champaign