LSTM for T-DLA+: Efficient computation of quantized LSTM networks
Harisrikanth, Keshav
Permalink
https://hdl.handle.net/2142/107274
Description
Title
LSTM for T-DLA+: Efficient computation of quantized LSTM networks
Author(s)
Harisrikanth, Keshav
Contributor(s)
Chen, Deming
Issue Date
2020-05
Keyword(s)
FPGA
Embedded
Accelerator architectures
Reconfigurable architectures
LSTM
Quantization
Abstract
Neural networks involve complex computation that can be extremely resource intensive. This limits their usability in contexts where only small amounts of hardware can be deployed on low power budgets. One key way to significantly reduce the computational cost of a neural network is quantization, in which the values throughout the network are represented in fewer bits. A ternarized network is one in which every weight has been quantized to one of three values: +1, -1, and 0. Past works have shown that, despite this simple weight system, ternarized neural networks can achieve accuracy much closer to that of full floating-point networks than might be expected (a sketch of one such quantization scheme follows below).
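As background, the following is a minimal sketch of one common threshold-based ternarization scheme, written in Python with NumPy; the threshold rule and the 0.7 scaling factor here are illustrative assumptions, not the quantizer used in the thesis.

```python
import numpy as np

def ternarize(weights, threshold_factor=0.7):
    """Threshold-based ternarization: map each weight to {-1, 0, +1}.

    threshold_factor is a hypothetical tuning constant; the thesis
    may use a different quantization rule.
    """
    # Per-tensor threshold proportional to the mean absolute weight.
    delta = threshold_factor * np.mean(np.abs(weights))
    ternary = np.zeros_like(weights)
    ternary[weights > delta] = 1.0
    ternary[weights < -delta] = -1.0
    return ternary

w = np.random.randn(4, 4)
print(ternarize(w))  # entries are only -1.0, 0.0, or +1.0
```

The payoff in hardware is that multiplying by a ternary weight reduces to an add, a subtract, or a skip, eliminating multipliers entirely.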
To further extract computational efficiency from these networks, we have designed and analyzed DNN acceleration on embedded FPGAs by creating a ternarized deep neural network coprocessor with a custom-designed ISA. We previously built a basic ternarized neural network accelerator capable of basic CNN computation. A key improvement of this design over past works is support for LSTM operations, with hardware built specifically for efficient LSTM computation. This continuing work faces the significant challenge of designing an ISA that is general enough for any task yet specific enough to execute well in hardware, while maintaining good code density. This thesis primarily focuses on the LSTM hardware computation units themselves; the standard LSTM cell computation that such units implement is sketched below.
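For reference, here is a short Python sketch of the standard LSTM cell step that such computation units implement; the stacked-gate layout, the toy sizes, and the sign-based ternary weights are illustrative assumptions, not the thesis's actual datapath or ISA.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One standard LSTM time step.

    W and U hold the input and recurrent weights for the four gates
    (input i, forget f, cell candidate g, output o) stacked row-wise.
    In a ternarized network their entries are restricted to {-1, 0, +1},
    so each multiply in W @ x and U @ h_prev reduces to an add,
    a subtract, or a skip.
    """
    gates = W @ x + U @ h_prev + b
    i, f, g, o = np.split(gates, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c_prev + i * g   # new cell state
    h = o * np.tanh(c)       # new hidden state
    return h, c

# Toy usage with hypothetical sizes.
n_in, n_hid = 8, 16
W = np.sign(np.random.randn(4 * n_hid, n_in))   # ternary-valued weights
U = np.sign(np.random.randn(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = lstm_step(np.random.randn(n_in), np.zeros(n_hid),
                 np.zeros(n_hid), W, U, b)
```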