Scalable algorithms for Bayesian variable selection
Wang, Jin
Permalink
https://hdl.handle.net/2142/92827
Description
- Title
- Scalable algorithms for Bayesian variable selection
- Author(s)
- Wang, Jin
- Issue Date
- 2016-07-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Liang, Feng
- Doctoral Committee Chair(s)
- Liang, Feng
- Committee Member(s)
- Marden, John I.
- Ji, Yuan
- Zhao, Dave
- Park, Trevor
- Department of Study
- Statistics
- Discipline
- Statistics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Variable Selection
- EM
- Ensemble
- Variational Bayes
- Asymptotic Analysis
- Logistic model
- Abstract
- The innovation of modern technologies drives research and development on high-dimensional data analysis in diverse fields, where variable selection plays a pivotal role in ensuring credible model estimation. We focus on scalable algorithms for variable selection that can handle large data sets. First, we propose an EM algorithm that returns the MAP estimate of the set of relevant variables. Owing to its particular updating scheme, the algorithm can be implemented efficiently. We also show that the MAP estimate returned by our EM algorithm achieves variable selection consistency. In practice, the EM algorithm tends to get stuck at local peaks, so we propose an ensemble version: repeatedly apply the EM algorithm to a subset of bootstrap sample data and then aggregate the results. Empirical studies demonstrate the superior performance of this Bayesian Bootstrap EM algorithm. Second, we propose a hybrid computational framework for Bayesian variable selection. This new algorithm, SAB, combines the classical EM algorithm with the variational Bayes algorithm. It is very fast in handling high-dimensional data with a large number of covariates. To address a critical biological problem, we apply SAB to a state-of-the-art cancer genomics data set with the goal of understanding the complex regulatory relationship between miRNAs and mRNAs in cancer. Third, we study the asymptotic behavior of the SAB algorithm in detail and prove that SAB achieves selection consistency, Bayesian consistency, and an oracle property when the number of covariates grows exponentially with the sample size. Lastly, we extend the hybrid framework for Bayesian variable selection to logistic models, where we adopt the Polya-Gamma specification and show that this specification is equivalent to the local approximation method in the variational Bayes framework.
- Graduation Semester
- 2016-08
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/92827
- Copyright and License Information
- Copyright 2016 Jin Wang
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois