Longitudinal principal components analysis for binary and continuous data
Kinson, Christopher Leron
Permalink
https://hdl.handle.net/2142/98374
Description
- Title
- Longitudinal principal components analysis for binary and continuous data
- Author(s)
- Kinson, Christopher Leron
- Issue Date
- 2017-07-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Qu, Annie
- Doctoral Committee Chair(s)
- Qu, Annie
- Committee Member(s)
- Culpepper, Steven
- Marden, John
- Simpson, Douglas
- Department of Study
- Statistics
- Discipline
- Statistics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Time-varying
- Longitudinal
- Eigen-decomposition
- Non-parametric spline
- Odds ratio
- Abstract
- "Large-scale data" or "big data" is an enormously popular term in the data science and statistics communities. These datasets are often collected over periods of time, at hourly and weekly rates, with the help of technological advancements in physical and cloud-based storage. The information stored is useful, especially in biomedicine, insurance, and retail, where patients and customers are crucial to business survival. In this thesis, we develop new statistical methodologies for handling two types of datasets: continuous data and binary data. Time-varying associations among store products provide important information to capture changes in consumer shopping behavior. In the first part of this thesis, we propose a longitudinal principal component analysis (LPCA) using a random-effects eigen-decomposition, where the eigen-decomposition utilizes longitudinal information over time to model time-varying eigenvalues and eigenvectors of the corresponding covariance matrices. Our method can effectively analyze large marketing data containing sales information for selected consumer products from hundreds of stores over an 11-year time period. The proposed method leads to more accurate estimation and interpretation than comparable approaches, as illustrated through finite-sample simulations. We show our method's capabilities and provide an interpretation of the eigenvector estimates in an application to IRI marketing data. In the second part of this thesis, we formulate the LPCA problem for binary data. We propose capturing the associations among the products or variables through odds ratios, where a two-by-two contingency table contains probabilities representing the joint distribution of two binary products. The eigen-decomposition utilizes longitudinal information over time to model time-varying eigenvalues and eigenvectors of the corresponding odds ratio matrices. These odds ratio matrices measure the pairwise associations among the binary products and are more appropriate to use than the Pearson correlation coefficient. Our method illustrates an improvement in visualization and interpretation through simulation studies and an application to IRI panel data of individual customer purchases.
- Graduation Semester
- 2017-08
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/98374
- Copyright and License Information
- Copyright 2017 Christopher Kinson
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois