Change point detection for high dimensional data and valid inference for Bayesian linear models
Wu, Teng
Permalink
https://hdl.handle.net/2142/110718
Description
- Title
- Change point detection for high dimensional data and valid inference for Bayesian linear models
- Author(s)
- Wu, Teng
- Issue Date
- 2021-04-23
- Director of Research (if dissertation) or Advisor (if thesis)
- Shao, Xiaofeng
- Narisetty, Naveen Naidu
- Doctoral Committee Chair(s)
- Shao, Xiaofeng
- Narisetty, Naveen Naidu
- Committee Member(s)
- Li, Bo
- Yang, Yun
- Department of Study
- Statistics
- Discipline
- Statistics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Change point detection
- Heteroscedasticity
- Sequential monitoring
- Quantile regression
- Bayesian modeling
- High dimensional statistics
- Linear model
- Abstract
- We propose statistical methodologies for high dimensional change point detection and inference for Bayesian linear models. In the first project, we propose a change point detection method that tests for a mean shift in high dimensional observations with unknown heteroscedasticity. The proposed tests target dense alternatives, and a wild bootstrap procedure is used to approximate the unknown limiting distribution. The bootstrap test is free of tuning parameters, and we establish bootstrap consistency under the null. We extend the theoretical results to testing for multiple change points and justify the size and power of the resulting tests. To estimate the unknown change point locations, we use the wild binary segmentation algorithm. Empirical studies show that our methods have the correct size and higher power than the existing approach when heteroscedasticity is present.
In the second project, we propose a class of monitoring statistics for a mean shift in a sequence of high-dimensional observations. Inspired by recent U-statistic based retrospective tests, we extend the U-statistic based approach to the sequential monitoring problem by developing a new adaptive monitoring procedure that can detect both dense and sparse changes in real time. Unlike existing work based on self-normalization, we introduce a class of estimators for the $q$-norm of the covariance matrix and prove their ratio consistency. We further develop recursive algorithms that make the monitoring procedure fast to compute. The advantage of the proposed methodology is demonstrated via simulation studies and real data illustrations.
In the third project, we propose a score-based working likelihood function for quantile regression that supports inference for an arbitrary number of conditional quantiles. We show that the proposed likelihood can be used in a Bayesian framework and leads to valid frequentist inference, whereas the commonly used asymmetric Laplace working likelihood leads to invalid interval estimates and requires further correction. For computation, we propose a novel adaptive importance sampling algorithm to compute important posterior summaries such as the posterior mean and the covariance matrix. Our approach makes it feasible to perform valid inference for parameters such as the slope differences at different quantile levels, which is either impossible or cumbersome with existing Bayesian approaches. Empirical results demonstrate that the proposed likelihood has good estimation and inferential properties and that the proposed computational algorithm is more efficient than its competitors.
In the fourth project, we propose a new Bayesian method to perform valid inference for low dimensional parameters in high dimensional linear models under sparsity constraints. The idea is to use quasi Bayesian posteriors based on partial regression models to remove the effect of high dimensional nuisance variables and to generate posterior samples of the parameters for valid uncertainty quantification. We call the final distribution used for inference the "conditional Bayesian posterior" because it is constructed conditional on quasi posterior distributions of other parameters and does not admit a fully Bayesian interpretation. Unlike existing Bayesian regularization methods, our method can quantify the estimation uncertainty for arbitrarily small signals and therefore does not require variable selection consistency to guarantee its validity. Theoretically, we show that the resulting Bayesian credible intervals achieve the desired coverage probabilities in the frequentist sense. Methodologically, our framework can easily incorporate popular Bayesian regularization procedures, such as those based on spike and slab priors and horseshoe priors, to facilitate high accuracy estimation and inference. Numerically, our method shows competitive empirical performance in extensive simulation studies and a real data analysis.
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/110718
- Copyright and License Information
- Copyright 2021 Teng Wu
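The first project in the abstract pairs a mean-shift test with a wild bootstrap. The sketch below is only a rough illustration of that general idea and is not the dissertation's statistic. The CUSUM form, the sum-of-squares aggregation across coordinates (a common choice when the alternative is dense), the Gaussian multipliers, and the names `cusum_stat` and `wild_bootstrap_pvalue` are all assumptions introduced here.

```python
import numpy as np


def cusum_stat(x):
    """Return (max squared-L2 CUSUM score, arg-max split point) for an n x p array.

    Generic textbook-style statistic, assumed here for illustration only.
    """
    n = x.shape[0]
    csum = np.cumsum(x, axis=0)              # partial sums S_1, ..., S_n
    total = csum[-1]                         # S_n
    k = np.arange(1, n)[:, None]             # candidate split points 1..n-1
    cusum = (csum[:-1] - k / n * total) / np.sqrt(n)
    scores = np.sum(cusum ** 2, axis=1)      # aggregate over all p coordinates
    return scores.max(), int(scores.argmax()) + 1


def wild_bootstrap_pvalue(x, n_boot=200, seed=0):
    """Approximate a p-value by multiplying centered rows with N(0, 1) weights.

    Each observation keeps its own scale, which is why multiplier (wild)
    bootstraps are a natural fit when variances differ across observations.
    """
    rng = np.random.default_rng(seed)
    observed, k_hat = cusum_stat(x)
    centered = x - x.mean(axis=0)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        weights = rng.standard_normal(x.shape[0])[:, None]
        boot[b], _ = cusum_stat(centered * weights)
    p_value = (1 + np.sum(boot >= observed)) / (1 + n_boot)
    return p_value, k_hat
```

Calling `wild_bootstrap_pvalue(x)` on an `n x p` array returns a bootstrap p-value together with the split point that maximizes the score. Locating several change points would need an additional search step, such as the wild binary segmentation the abstract mentions, which this sketch does not attempt.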
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois