Trainability and generalization of small-scale neural networks
Song, Myung Hwan
Permalink
https://hdl.handle.net/2142/104821
Description
Title
Trainability and generalization of small-scale neural networks
Author(s)
Song, Myung Hwan
Issue Date
2019-04-22
Director of Research (if dissertation) or Advisor (if thesis)
Sun, Ruoyu
Department of Study
Industrial & Enterprise Sys Eng
Discipline
Industrial Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Date of Ingest
2019-08-23T19:51:51Z
Keyword(s)
Deep Learning
Neural Networks
Learning Theory
Abstract
As deep learning has become the solution for various machine learning and artificial intelligence applications, its architectures have developed accordingly. Modern deep learning applications often use overparameterized settings, the opposite of what conventional learning theory suggests. While deep neural networks are considered less vulnerable to overfitting even with their overparameterized architectures, this project observed that properly trained small-scale networks can indeed outperform their larger counterparts. The generalization ability of small-scale networks has been overlooked in much research and practice because of their extremely slow convergence speed. This project observed that an imbalanced layer-wise gradient norm can hinder the overall convergence speed of a neural network, and that narrow networks are particularly vulnerable to this. This project investigates possible reasons for the convergence failure of small-scale neural networks and suggests a strategy to alleviate the problem.
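To illustrate the layer-wise gradient-norm imbalance the abstract refers to, below is a minimal PyTorch sketch (not from the thesis; the narrow MLP sizes and dummy data are illustrative assumptions). It runs one backward pass and prints each layer's weight-gradient norm; a large spread across layers is the imbalance that can slow the training of narrow networks.

import torch
import torch.nn as nn

# Hypothetical narrow MLP; layer widths are illustrative, not from the thesis.
model = nn.Sequential(
    nn.Linear(784, 8), nn.ReLU(),
    nn.Linear(8, 8), nn.ReLU(),
    nn.Linear(8, 10),
)

x = torch.randn(32, 784)          # dummy input batch
y = torch.randint(0, 10, (32,))   # dummy class labels

loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()

# Print the gradient norm of each parameter tensor; comparing these
# values across layers reveals any layer-wise imbalance.
for name, p in model.named_parameters():
    if p.grad is not None:
        print(f"{name}: {p.grad.norm().item():.4e}")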