Some advances in Bayesian inference and generative modeling
Tang, Rong
Permalink: https://hdl.handle.net/2142/120227
Description
- Title
- Some advances in Bayesian inference and generative modeling
- Author(s)
- Tang, Rong
- Issue Date
- 2023-04-05
- Director of Research (if dissertation) or Advisor (if thesis)
- Yang, Yun
- Doctoral Committee Chair(s)
- Yang, Yun
- Committee Member(s)
- Chen, Xiaohui
- Liang, Feng
- Zhu, Ruoqing
- Department of Study
- Statistics
- Discipline
- Statistics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Bayesian inference
- Generative modeling
- Variational autoencoder
- Minimax rate
- Adversarial losses
- Mixing time
- Abstract
- Statisticians often seek complex statistical models that provide a comprehensive understanding of complicated data sets without imposing restrictive distributional assumptions. In machine learning, (deep) generative modeling has achieved great success in learning complicated high-dimensional distributions, for example by synthesizing realistic images and text. Instead of seeking an explicit expression for the probability density, generative modeling aims to learn a generator (decoder) that maps samples from a simple prior distribution to points resembling the given data. Representing a complicated distribution through a generative model also allows for efficient implementation, since generators are usually easier to optimize than constrained distributions. Bayesian inference is another popular approach for analyzing complex data, with broad applications in fields such as social science, biomedicine, genomics, and signal processing. Unlike generative modeling, Bayesian inference specifies a full probability model and performs inference conditional on both the model and the data. This dissertation addresses both generative modeling and Bayesian inference.

  The first part of the dissertation presents advances in generative modeling. The first project focuses on a popular class of generative models, variational autoencoders (VAEs). We establish a theoretical framework for analyzing the excess risk of VAEs in density estimation, covering both parametric and nonparametric cases, through the lens of M-estimation (the standard VAE objective is sketched after the record metadata below). The second project addresses a follow-up question: the fundamental limit of estimating an unknown manifold-supported distribution, expressed as a minimax rate under adversarial losses (recalled below). The established rate clarifies how problem characteristics, including the intrinsic dimension of the data and the smoothness levels of the target distribution and the manifold, affect the fundamental limit of high-dimensional distribution estimation. In the third project, we apply techniques from the second project to nonparametric two-sample hypothesis testing under adversarial losses. We characterize the optimal detection boundary of two-sample testing in terms of the dimensionalities and smoothness levels of the underlying densities and of the discriminator class defining the adversarial loss. Furthermore, we propose a testing procedure that simultaneously attains the optimal detection boundary under many common adversarial losses, including those induced by the $\ell_1$ and $\ell_2$ distances and by Wasserstein distances.

  The second part of the dissertation investigates problems in Bayesian inference. In the fourth project, we study the feasibility of conducting valid Bayesian uncertainty quantification in empirical risk minimization. To achieve this, we propose a novel Bayesian inferential approach that replaces the (misspecified or partly specified) likelihood with a proper exponentially tilted empirical likelihood plus a regularization term. We demonstrate that the Bayesian credible regions derived from the proposed posterior are automatically calibrated to deliver valid uncertainty quantification, and we extend the method to high-dimensional models under sparsity constraints by incorporating sparsity-inducing priors. In the fifth project, we turn to Bayesian computation, analyzing the complexity of sampling from a Bayesian posterior using the Metropolis-adjusted Langevin algorithm (MALA; a minimal sketch follows below). We establish that when the target posterior is near-Gaussian, the optimal dependence of MALA's non-asymptotic mixing time, after the burn-in period, on the parameter dimension $d$ is $d^{1/3}$.
- Graduation Semester
- 2023-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Rong Tang
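
A note for orientation, not part of the thesis record: the variational autoencoder named in the first project is standardly trained by maximizing the evidence lower bound (ELBO). The display below is the textbook objective; the precise variational family and loss analyzed in the dissertation are specified in the thesis itself.

    \log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right),

where $q_\phi(z \mid x)$ is the encoder, $p_\theta(x \mid z)$ is the decoder (the generator), and $p(z)$ is the simple prior over latent codes.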
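The "adversarial losses" of the second and third projects are, in standard usage, integral probability metrics: discrepancies indexed by a discriminator class $\mathcal{F}$,

    d_{\mathcal{F}}(\mu, \nu) \;=\; \sup_{f \in \mathcal{F}} \left| \, \mathbb{E}_{X \sim \mu}\, f(X) - \mathbb{E}_{Y \sim \nu}\, f(Y) \, \right|.

Taking $\mathcal{F}$ to be the 1-Lipschitz functions recovers the Wasserstein-1 distance, and Hölder balls of varying smoothness yield the smoothness-indexed losses in which minimax rates and detection boundaries of this kind are stated. This definition is given for context; the exact discriminator classes used are specified in the dissertation.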
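For the fifth project, the following is a minimal, self-contained NumPy sketch of standard MALA, included for orientation only. The function names and the Gaussian toy target are illustrative, not taken from the dissertation, and the step-size rule step ∝ d^(-1/3) is the classical optimal-scaling choice consistent with the $d^{1/3}$ mixing-time result quoted in the abstract.

    import numpy as np

    def mala(log_post, grad_log_post, x0, step, n_iter, seed=0):
        """Metropolis-adjusted Langevin algorithm: a Langevin proposal
        plus a Metropolis accept/reject step, so the chain leaves the
        target posterior invariant."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float).copy()
        d = x.size
        samples = np.empty((n_iter, d))
        lp_x, g_x = log_post(x), grad_log_post(x)
        for t in range(n_iter):
            # Langevin proposal: gradient drift plus Gaussian noise.
            prop = x + step * g_x + np.sqrt(2.0 * step) * rng.standard_normal(d)
            lp_p, g_p = log_post(prop), grad_log_post(prop)
            # Log densities of the forward and reverse proposal kernels.
            fwd = -np.sum((prop - x - step * g_x) ** 2) / (4.0 * step)
            rev = -np.sum((x - prop - step * g_p) ** 2) / (4.0 * step)
            # Metropolis correction for the discretization error.
            if np.log(rng.uniform()) < lp_p - lp_x + rev - fwd:
                x, lp_x, g_x = prop, lp_p, g_p
            samples[t] = x
        return samples

    # Toy usage: a standard Gaussian "posterior" in d = 100 dimensions,
    # with the classical d^(-1/3) step-size scaling (illustrative values).
    d = 100
    chain = mala(log_post=lambda x: -0.5 * np.sum(x ** 2),
                 grad_log_post=lambda x: -x,
                 x0=np.zeros(d),
                 step=0.5 * d ** (-1.0 / 3.0),
                 n_iter=5000)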
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)