Fast algorithms for Bayesian variable selection
Huang, Xichen
Permalink
https://hdl.handle.net/2142/98201
Description
- Title
- Fast algorithms for Bayesian variable selection
- Author(s)
- Huang, Xichen
- Issue Date
- 2017-07-10
- Director of Research (if dissertation) or Advisor (if thesis)
- Liang, Feng
- Doctoral Committee Chair(s)
- Liang, Feng
- Committee Member(s)
- Fellouris, Georgios
- Qu, Annie
- Shao, Xiaofeng
- Department of Study
- Statistics
- Discipline
- Statistics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Date of Ingest
- 2017-09-29T17:45:38Z
- Keyword(s)
- Bayesian variable selection
- Variational Bayesian methods
- Online learning
- Abstract
- Variable selection for regression and classification models is an important but challenging problem. There are two general approaches: one based on penalized likelihood, and the other on a Bayesian framework. We focus on the Bayesian framework, in which a hierarchical prior is imposed on all unknown parameters, including the unknown variable set. The Bayesian approach has many advantages: for example, we can obtain the posterior distribution over sub-models, and more accurate predictions may be obtained through model averaging. However, because the posterior distribution of the model parameters is usually not available in closed form, posterior inference relies on Markov chain Monte Carlo (MCMC), whose high computational cost, especially in high-dimensional settings, makes Bayesian approaches less practical. To handle datasets with a large number of features, we aim to develop fast algorithms for Bayesian variable selection that approximate the true posterior distribution yet still return the right inference (at least asymptotically). In this thesis, we start with a variational algorithm for linear regression. Our algorithm builds on the work of Carbonetto and Stephens (2012), with essential modifications including the updating scheme and the truncation of posterior inclusion probabilities. We show that our algorithm achieves both frequentist and Bayesian variable selection consistency. We then extend the variational algorithm to logistic regression by incorporating the Polya-Gamma data-augmentation trick (Polson et al., 2013), which links our algorithm for linear regression with logistic regression. However, because the variational algorithm must update the variational distribution of all the latent Polya-Gamma random variables, one per observation, at every iteration, it is slow when there are many observations and may even be infeasible when the data are too large to fit in computer memory.
We therefore propose an online algorithm for logistic regression within the framework of online convex optimization. Our algorithm is fast and achieves accuracy (log-loss) similar to that of the state-of-the-art FTRL-Proximal (Follow-the-Regularized-Leader, proximal variant) algorithm.
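To make the variational approach described in the abstract concrete, here is a minimal sketch of coordinate-ascent variational updates for a spike-and-slab linear regression, in the spirit of Carbonetto and Stephens (2012). The hyperparameter values and the plain-NumPy structure are illustrative assumptions only; the thesis's actual algorithm adds a modified updating scheme and truncation of the posterior inclusion probabilities, which this sketch omits.

```python
import numpy as np

def cavi_spike_slab(X, y, sigma2=1.0, sigma_b2=1.0, pi=0.1,
                    n_iter=100, tol=1e-6):
    """Coordinate-ascent variational inference for spike-and-slab
    linear regression (Carbonetto & Stephens, 2012, style updates).
    Hyperparameters sigma2 (noise variance), sigma_b2 (slab variance)
    and pi (prior inclusion probability) are illustrative choices.
    Returns posterior inclusion probabilities alpha, slab means mu,
    and slab variances s2."""
    n, p = X.shape
    xtx = np.sum(X * X, axis=0)             # X_j' X_j for each feature
    alpha = np.full(p, pi)                   # posterior inclusion probs
    mu = np.zeros(p)                         # posterior slab means
    s2 = sigma2 / (xtx + 1.0 / sigma_b2)     # posterior slab variances
    Xr = X @ (alpha * mu)                    # running fitted values
    logit_pi = np.log(pi / (1.0 - pi))
    for _ in range(n_iter):
        alpha_old = alpha.copy()
        for j in range(p):
            Xr -= X[:, j] * (alpha[j] * mu[j])   # remove feature j's contribution
            mu[j] = (s2[j] / sigma2) * (X[:, j] @ (y - Xr))
            # log-odds update for the inclusion probability of feature j
            u = (logit_pi
                 + 0.5 * np.log(s2[j] / (sigma2 * sigma_b2))
                 + 0.5 * mu[j] ** 2 / s2[j])
            alpha[j] = 1.0 / (1.0 + np.exp(-u))
            Xr += X[:, j] * (alpha[j] * mu[j])   # add it back
        if np.max(np.abs(alpha - alpha_old)) < tol:
            break
    return alpha, mu, s2
```

On simulated data with one strong predictor, the inclusion probability of the true variable is driven toward one while the null variables stay near the prior, which is the behavior the consistency results in the thesis concern.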
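For the online part, the baseline the abstract compares against can be sketched as a per-coordinate FTRL-Proximal learner for logistic regression (following McMahan et al.'s published update rule). All hyperparameter values here are illustrative assumptions, and this is a generic sketch of the baseline, not the thesis's proposed algorithm.

```python
import numpy as np

class FTRLProximal:
    """Per-coordinate FTRL-Proximal for online logistic regression.
    alpha/beta control the per-coordinate learning-rate schedule;
    l1/l2 are regularization strengths (illustrative defaults)."""
    def __init__(self, dim, alpha=0.5, beta=1.0, l1=0.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = np.zeros(dim)   # accumulated adjusted gradients
        self.n = np.zeros(dim)   # accumulated squared gradients

    def _weights(self):
        # Closed-form lazy weights: soft-thresholded by l1, scaled by
        # the per-coordinate adaptive learning rate and l2.
        w = np.zeros_like(self.z)
        active = np.abs(self.z) > self.l1
        w[active] = -(self.z[active] - np.sign(self.z[active]) * self.l1) / (
            (self.beta + np.sqrt(self.n[active])) / self.alpha + self.l2)
        return w

    def predict(self, x):
        return 1.0 / (1.0 + np.exp(-x @ self._weights()))

    def update(self, x, y):
        """One online step on example (x, y) with y in {0, 1};
        returns the prediction made before the update."""
        w = self._weights()
        p = 1.0 / (1.0 + np.exp(-x @ w))
        g = (p - y) * x                          # gradient of the log-loss
        sigma = (np.sqrt(self.n + g * g) - np.sqrt(self.n)) / self.alpha
        self.z += g - sigma * w
        self.n += g * g
        return p
```

Streaming examples through `update` gives the progressive (online) log-loss that the abstract uses as its accuracy measure; only the accumulators `z` and `n` are kept in memory, which is what makes such methods feasible when the data cannot be loaded at once.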
- Graduation Semester
- 2017-08
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/98201
- Copyright and License Information
- Copyright 2017 Xichen Huang
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)