Applications of diffusion processes: machine learning, optimization, and sampling
Tzen, Belinda
Permalink
https://hdl.handle.net/2142/115878
Description
- Title
- Applications of diffusion processes: machine learning, optimization, and sampling
- Author(s)
- Tzen, Belinda
- Issue Date
- 2022-07-01
- Director of Research (if dissertation) or Advisor (if thesis)
- Raginsky, Maxim
- Doctoral Committee Chair(s)
- Raginsky, Maxim
- Committee Member(s)
- Jiang, Nan
- Rigollet, Philippe
- Srikant, Rayadurgam
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- machine learning
- theory
- Abstract
- Many problems in machine learning and statistics today exhibit tremendous size in various parameters, to the point that they can often be conceived of as infinite. Doing so enables the use of mathematical tools for continuous systems in their analysis, which entails a careful consideration of the relationship between the discrete system in question and its continuous approximation. The work presented in this thesis uses this approach to study problems in optimization, sampling, and learning, by modeling the phenomena of interest with diffusion processes. In the first part, we bridge a short- and long-timescale understanding of optimization in non-convex settings using a noisy gradient-based method, the Langevin algorithm, a discretization of the eponymous diffusion process (a minimal sketch of this update rule appears after this record). In the second part, we use a control-theoretic framework to examine generative models specified by infinite noisy function compositions, corresponding to nonlinear diffusion processes, and demonstrate their expressiveness as well as methods for performing sampling and inference on them. Subsequently, we study the specific case where these models constitute the infinite-depth limit of feedforward neural networks, and discuss computational aspects of performing variational inference with them. Finally, we investigate probability distributions that can be viewed as the mean-field limit of the weights in very wide neural networks, and contrast control-theoretically optimal and Langevin dynamics vis-à-vis an entropically regularized risk minimization objective. We show why there is an exponential gap in the general case, and close by illustrating a setting where naive gradient-based dynamics in fact coincide exactly with those specified by optimal control: mirror Langevin dynamics, corresponding to the continuous limit of noisy mirror descent.
- Graduation Semester
- 2022-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Belinda Tzen
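To make the Langevin algorithm referenced in the abstract concrete, the following is a minimal Python sketch of the unadjusted Langevin algorithm, the Euler-Maruyama discretization of the Langevin diffusion dX_t = -grad f(X_t) dt + sqrt(2/beta) dW_t. This is not code from the thesis; the objective f, the step size, and the inverse temperature beta are illustrative assumptions.

import numpy as np

def langevin_step(x, grad_f, step, beta, rng):
    # One ULA step: x <- x - step * grad f(x) + sqrt(2 * step / beta) * N(0, I)
    noise = rng.standard_normal(x.shape)
    return x - step * grad_f(x) + np.sqrt(2.0 * step / beta) * noise

def grad_f(x):
    # Gradient of an illustrative non-convex objective f(x) = (||x||^2 - 1)^2 / 4,
    # whose minimizer set is the unit circle.
    return (x @ x - 1.0) * x

rng = np.random.default_rng(0)
x = rng.standard_normal(2)
for _ in range(10000):
    x = langevin_step(x, grad_f, step=1e-3, beta=10.0, rng=rng)
print(x)  # iterate ends up near the unit circle

The update differs from plain gradient descent only in the injected Gaussian noise; over long horizons the law of the iterates approximates the Gibbs measure proportional to exp(-beta * f), which is what connects the short-timescale (optimization) and long-timescale (sampling) views discussed in the first part of the thesis.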
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY