Learning compact neural network representations with structural priors
Han, Wei
Description
- Title
- Learning compact neural network representations with structural priors
- Author(s)
- Han, Wei
- Issue Date
- 2019-04-18
- Director of Research (if dissertation) or Advisor (if thesis)
- Huang, Thomas S.
- Doctoral Committee Chair(s)
- Huang, Thomas S.
- Committee Member(s)
- Hasegawa-Johnson, Mark
- Liang, Zhi-Pei
- Hwu, Wen-Mei
- Department of Study
- Electrical & Computer Engineering
- Discipline
- Electrical & Computer Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- neural networks
- recurrent neural networks
- convolutional neural networks
- Abstract
- The development of deep neural networks has taken two directions. On the one hand, networks have become deeper and wider, employing drastically more model parameters and consuming more training data. On the other hand, simplification of the internal structure of neural networks has also contributed to their success. Notably, two important families of neural networks, convolutional neural networks (CNNs) and recurrent neural networks (RNNs), both introduce structural priors and share model parameters internally to simplify the network. In this dissertation, we investigate several alternative neural network structural priors in order to learn more compact CNN and RNN models and achieve better parameter-to-performance ratios. We develop these structures at three abstraction levels, each with its own target applications. (Minimal code sketches of the three ideas follow this record's metadata.)

First, motivated by filter redundancy in convolutional neural networks, we study parameter sharing across the filters of a convolutional layer. Instead of the conventional approach of treating CNN filters as a set of independent model parameters, we exploit the 2D spatial correlation among filters and propose sharing filter parameters as if they were overlapping slices of a shared 3D tensor. Experiments show that the proposed approach effectively reduces the number of parameters in several state-of-the-art CNN architectures while maintaining competitive performance.

The second problem concerns the inter-layer connectivity pattern of RNNs, the family of neural network models designed for time-series data. One longstanding challenge in RNNs is the vanishing gradient problem, which hinders a model's ability to learn long-term dependency, that is, temporal dependency spanning many time steps. Motivated by recent developments in two largely unrelated areas, skip connections in general neural networks and dilated convolution in image and audio problems, we propose DilatedRNN, a simple but principled way to construct multi-layer RNNs using multi-resolution recurrent skip connections. The proposed method is conceptually similar to dilated convolution but takes full advantage of the modeling power of RNNs. We show that DilatedRNN is particularly effective on problems where long-term dependencies are crucial.

Finally, inspired by the structural similarity between CNNs and unrolled single-layer RNNs, we study parameter sharing across sequentially connected network layers. Specifically, we focus on the family of feedforward CNNs that have an equivalent RNN form, and tie the layer parameters according to the standard RNN unrolling rule. Empirically, this family of models not only provides a desirable balance between model complexity and performance, but also leads to novel architectures that can be easily combined with domain knowledge.
- Graduation Semester
- 2019-05
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/105057
- Copyright and License Information
- Copyright 2019 Wei Han
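The sketches below illustrate the three structural priors described in the abstract. They are minimal illustrations in PyTorch, not the dissertation's implementations; all class names, hyperparameters, and wiring details are assumptions.

First, filter sharing within a convolutional layer: instead of storing C_out independent filters, the filter bank is read off one shared 3D tensor as overlapping slices, so the parameter count grows with the shared tensor's depth rather than with C_out * C_in. The fixed-stride slicing scheme (`step`) is a simplified assumption, not the dissertation's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedSliceConv2d(nn.Module):
    """Convolution whose filters are overlapping slices of one shared 3D tensor.

    Filter i of shape (in_channels, k, k) is read off the shared tensor starting
    at offset i * step along its first axis, so neighboring filters overlap and
    reuse parameters. (Hypothetical sketch; names and scheme are illustrative.)
    """

    def __init__(self, in_channels, out_channels, kernel_size, step=1):
        super().__init__()
        depth = in_channels + (out_channels - 1) * step  # length of shared axis
        self.shared = nn.Parameter(
            torch.randn(depth, kernel_size, kernel_size) * 0.01)
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.step = step

    def forward(self, x):
        # Assemble the (out_channels, in_channels, k, k) filter bank from slices.
        filters = torch.stack([
            self.shared[i * self.step : i * self.step + self.in_channels]
            for i in range(self.out_channels)
        ])
        return F.conv2d(x, filters, padding="same")

# Parameters: (C_in + (C_out - 1) * step) * k * k instead of C_out * C_in * k * k.
layer = SharedSliceConv2d(in_channels=16, out_channels=64, kernel_size=3, step=2)
out = layer(torch.randn(1, 16, 32, 32))  # -> (1, 64, 32, 32)
```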
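Second, a dilated recurrent layer in the spirit of DilatedRNN: at dilation d, the hidden state at step t is updated from the state at step t - d rather than t - 1, and stacking layers with dilations 1, 2, 4, ... yields the multi-resolution recurrent skip connections the abstract describes. The cell type (GRU) and the output wiring are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class DilatedRNNLayer(nn.Module):
    """Recurrent layer whose state at step t is updated from step t - dilation."""

    def __init__(self, input_size, hidden_size, dilation):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.dilation = dilation
        self.hidden_size = hidden_size

    def forward(self, x):  # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        # One hidden chain per phase; step t reads the state from t - dilation.
        states = [x.new_zeros(batch, self.hidden_size)
                  for _ in range(self.dilation)]
        outputs = []
        for t in range(seq_len):
            h = self.cell(x[t], states[t % self.dilation])
            states[t % self.dilation] = h
            outputs.append(h)
        return torch.stack(outputs)  # (seq_len, batch, hidden_size)

# Exponentially increasing dilations give multi-resolution skip connections.
stack = nn.Sequential(
    DilatedRNNLayer(32, 64, dilation=1),
    DilatedRNNLayer(64, 64, dilation=2),
    DilatedRNNLayer(64, 64, dilation=4),
)
out = stack(torch.randn(100, 8, 32))  # -> (100, 8, 64)
```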
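Third, tying parameters across sequentially connected layers: a feedforward CNN is treated as an unrolled single-layer RNN, so one convolution's weights are reused at every depth step and depth no longer multiplies the parameter count. This is a bare-bones sketch of the idea only; the dissertation's unrolling rule and architectures are richer.

```python
import torch
import torch.nn as nn

class RecurrentConvNet(nn.Module):
    """Feedforward CNN whose layers share one set of weights, applied
    repeatedly like an unrolled single-layer RNN. (Hypothetical sketch.)"""

    def __init__(self, channels, kernel_size=3, steps=4):
        super().__init__()
        # A single conv layer reused at every depth step (tied parameters).
        self.conv = nn.Conv2d(channels, channels, kernel_size, padding="same")
        self.steps = steps

    def forward(self, x):
        for _ in range(self.steps):  # unroll: same weights at every layer
            x = torch.relu(self.conv(x))
        return x

net = RecurrentConvNet(channels=32, steps=4)
out = net(torch.randn(1, 32, 28, 28))  # 4 layers deep, 1 layer's parameters
```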
Owning Collections
- Graduate Dissertations and Theses at Illinois (primary)
- Dissertations and Theses - Electrical and Computer Engineering