Efficient inference of convolutional neural networks on general purpose hardware using weight repetition
Agrawal, Rohit
Permalink
https://hdl.handle.net/2142/105251
Description
Title
Efficient inference of convolutional neural networks on general purpose hardware using weight repetition
Author(s)
Agrawal, Rohit
Issue Date
2019-04-24
Director of Research (if dissertation) or Advisor (if thesis)
Fletcher, Christopher W.
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Deep Neural Networks
Convolutional Neural Networks
Accelerator
CPU
GPU
Deep Learning Hardware
CNN Inference
Abstract
Deep Neural Networks (DNNs) have begun to permeate all corners of electronic society due to their high accuracy and machine efficiency per operation. Recent work has shown that weights within and across DNN filters exhibit large degrees of repetition due to the pigeonhole principle and modern weight quantization schemes, and that this weight repetition can be harnessed to improve DNN inference efficiency in an accelerator/ASIC context. This thesis develops new techniques so that weight repetition leads to an efficiency gain on general-purpose and programmable SIMD-based architectures, such as CPUs equipped with vector extensions. We show how to write high-performance software that does not require hardware modifications and can cope with the irregularity introduced by weight repetition schemes. Overall, our highly parallel software kernel achieves up to a 1.51x speedup in inference runtime over a state-of-the-art baseline.
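To make the weight-repetition idea from the abstract concrete, the sketch below (not the thesis's actual kernel; all names are illustrative) shows the core factorization on a single dot product: with heavily quantized weights, a filter reuses only a few distinct values, so inputs sharing a weight can be summed first and multiplied once per unique weight rather than once per element.

import numpy as np

def dot_product_naive(weights, inputs):
    # One multiply-accumulate per weight element.
    return float(np.dot(weights, inputs))

def dot_product_weight_repetition(weights, inputs):
    # Group inputs by their (repeated) weight value: sum each group once,
    # then do a single multiply per distinct weight value.
    unique_vals, group_ids = np.unique(weights, return_inverse=True)
    group_sums = np.zeros(unique_vals.shape, dtype=np.float64)
    np.add.at(group_sums, group_ids, inputs)       # accumulate inputs sharing a weight
    return float(np.dot(unique_vals, group_sums))  # one multiply per unique weight

# Example: a quantized filter with only 4 distinct weight values over 16 taps.
rng = np.random.default_rng(0)
w = rng.choice([-0.5, -0.25, 0.25, 0.5], size=16)
x = rng.standard_normal(16)
assert np.isclose(dot_product_naive(w, x), dot_product_weight_repetition(w, x))

In this toy example the arithmetic drops from 16 multiplies to 4, at the cost of an irregular gather/group step; handling that irregularity efficiently on SIMD hardware is the challenge the thesis addresses.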