On the characterization of the global landscape of neural networks
Li, Dawei
Permalink
https://hdl.handle.net/2142/115894
Description
- Title
- On the characterization of the global landscape of neural networks
- Author(s)
- Li, Dawei
- Issue Date
- 2022-07-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Sun, Ruoyu
- Doctoral Committee Chair(s)
- Sun, Ruoyu
- Committee Member(s)
- Chen, Xin
- Srikant, Rayadurgam
- Etesami, Seyed Rasoul
- Department of Study
- Industrial and Enterprise Systems Engineering
- Discipline
- Industrial Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Landscape
- Neural Network
- Deep Learning
- Non-convex Optimization
- Abstract
- Understanding why deep neural networks perform well has attracted much attention recently. The non-convexity of the associated loss functions, which may produce a bad landscape, is one of the major concerns in neural network training, yet the recent success of neural networks suggests that their loss landscape is not too bad. Nevertheless, a systematic characterization of the landscape has yet to be carried out. In this thesis, we aim for a more complete understanding of the global landscape of neural networks. In the first part, we study the existence of sub-optimal local minima for multi-layer networks. In particular, we prove that for neural networks with generic input data and smooth nonlinear activation functions, sub-optimal local minima can exist, no matter how wide the network is (as long as the last hidden layer has at least two neurons). This result overturns a classical claim that ``there exists no sub-optimal local minimum for 1-hidden-layer wide neural nets with sigmoid activation function'' and indicates that sub-optimal local minima are common even for wide neural nets. Given that sub-optimal local minima cannot be eliminated, a natural question is: what does the landscape of neural networks actually look like, and in particular, does width affect it? In the second part, we prove two results: on the positive side, for any continuous activation function, the loss surface of a class of wide networks has no sub-optimal basin, where a ``basin'' is defined as a set-wise strict local minimum; on the negative side, for a large class of networks with width below a threshold, we construct strict local minima that are not globally optimal. Together, these two results exhibit a phase transition in the landscape from narrow to wide networks and indicate the benefit of width. In the last part, we explore how this phase transition occurs via the ``generative mechanism'' of stationary points. We study a transformation called ``neuron splitting'', which maps a stationary point of a narrower network to stationary points of wider networks (see the code sketch following this record). We provide sufficient conditions under which the resulting stationary points of the wider network are local minima or saddle points: under certain conditions, a local minimum is mapped to a high-dimensional plateau containing both local minima and saddles of an arbitrarily wide network, while a saddle point can only be mapped to saddle points of wider networks by neuron splitting. Altogether, these results characterize the properties of stationary points of neural networks: their existence in different settings, their location and shape, and their evolution as the network is restructured. They not only provide a deeper understanding of the success of current wide neural networks, but also suggest potential methods for tackling the difficulties of training smaller networks.
- Graduation Semester
- 2022-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Dawei Li
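
The ``neuron splitting'' transformation described in the abstract can be illustrated in a few lines. Below is a minimal NumPy sketch under my own assumed setup (a 1-hidden-layer tanh network; the names W, v, split_neuron, and alpha are illustrative and do not come from the thesis). It shows the core property: duplicating a hidden neuron's incoming weights and sharing its outgoing weight across the two copies leaves the network's output, and hence the training loss, unchanged. Whether a stationary point mapped this way remains a local minimum or becomes a saddle is what the thesis's sufficient conditions distinguish.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 3                        # input dimension, hidden width (assumed toy sizes)
W = rng.normal(size=(m, d))        # incoming weights of the hidden layer
v = rng.normal(size=m)             # outgoing weights to the scalar output

def forward(W, v, x):
    # 1-hidden-layer network: f(x) = sum_j v[j] * tanh(W[j] . x)
    return v @ np.tanh(W @ x)

def split_neuron(W, v, k, alpha=0.5):
    # Duplicate hidden neuron k and split its outgoing weight as alpha / (1 - alpha).
    W_new = np.vstack([W, W[k:k + 1]])           # copy of neuron k's incoming weights
    v_new = np.append(v, (1.0 - alpha) * v[k])   # the new copy gets the (1 - alpha) share
    v_new[k] = alpha * v[k]                      # the original neuron keeps the alpha share
    return W_new, v_new

x = rng.normal(size=d)
W2, v2 = split_neuron(W, v, k=1, alpha=0.3)

# The wider (m + 1)-neuron network computes exactly the same function,
# so the loss value is unchanged at the split point for any choice of alpha.
assert np.allclose(forward(W, v, x), forward(W2, v2, x))
```

Varying alpha traces out a set of weight configurations with identical loss, which gives a rough intuition for the ``high-dimensional plateau'' of stationary points mentioned in the abstract.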
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)