Director of Research (if dissertation) or Advisor (if thesis)
Telgarsky, Matus J
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
implicit regularization
hinge loss
Abstract
A new loss function is proposed which applies the hinge loss an infinite number of times, pushing $f(x_i)y_i \to \infty$. It is proven that for a linear model on linearly separable data, gradient descent on this modified hinge loss converges in direction to the $\ell_2$ max-margin separator at a rate of $\mathcal{O}\left( \sqrt{d/t} \right)$, where $d$ is the dimension of the data and $t$ is the iteration count. Then an explicit formula is derived for the dynamical system underlying the gradient descent iterates of two-layer linear networks on the inner product loss. Using this dynamical system, an explicit algorithm is developed which, when implemented, exactly reproduces the gradient descent iterates of two-layer ReLU networks on the inner product loss. These results are studied further to draw conclusions about neural network optimization.
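A minimal sketch of the implicit-bias phenomenon the abstract describes. The thesis's modified hinge loss is not reproduced here; the logistic loss stands in for it, since gradient descent on the logistic loss is also known to converge in direction to the $\ell_2$ max-margin separator on linearly separable data. The data, step size, and iteration count are illustrative choices, not the thesis's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Separable toy data: labels come from the sign of <w_star, x>.
d, n = 2, 50
w_star = np.array([1.0, 2.0])
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

def grad(w):
    # Gradient of the average logistic loss (1/n) sum_i log(1 + exp(-y_i <w, x_i>)).
    # Margins are clipped before exponentiating purely to avoid overflow once
    # they grow large; the corresponding weights are already ~0 there.
    m = np.clip(y * (X @ w), -50.0, 50.0)
    return -(X * (y / (1.0 + np.exp(m)))[:, None]).mean(axis=0)

w = np.zeros(d)
eta = 1.0
for _ in range(100_000):
    w -= eta * grad(w)

# ||w_t|| diverges (every margin f(x_i) y_i -> infinity), but the direction
# w_t / ||w_t|| stabilizes; its minimum margin approaches the max margin.
w_dir = w / np.linalg.norm(w)
print("direction:", w_dir, "min margin:", (y * (X @ w_dir)).min())
```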
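And a sketch of an explicit dynamical system in the smallest possible case: a two-layer linear "network" $f(x) = abx$ with a single datum $x = y = 1$, assuming "inner product loss" means $\ell = -y f(x)$, so $\ell(a, b) = -ab$. The datum, initialization, and step size are illustrative, and the closed form below holds for this scalar case only; the thesis's general formula is not reproduced here.

```python
# In the coordinates u = a + b, v = a - b, the gradient descent updates
# decouple: u_{t+1} = (1 + eta) u_t and v_{t+1} = (1 - eta) v_t, giving a
# closed-form expression for the iterates that is checked below.
eta = 0.1
T = 25
a, b = 0.5, -0.2                   # arbitrary illustrative initialization
u0, v0 = a + b, a - b

for _ in range(T):
    # One GD step on l(a, b) = -a*b: dl/da = -b, dl/db = -a.
    a, b = a + eta * b, b + eta * a

# Closed-form solution of the decoupled dynamical system.
a_closed = ((1 + eta) ** T * u0 + (1 - eta) ** T * v0) / 2
b_closed = ((1 + eta) ** T * u0 - (1 - eta) ** T * v0) / 2
print(a, b)                        # iterated values
print(a_closed, b_closed)          # identical up to floating-point error
```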