Hypothesis testing and learning with small samples
Huang, Dayu
Permalink
https://hdl.handle.net/2142/42475
Description
- Title
- Hypothesis testing and learning with small samples
- Author(s)
- Huang, Dayu
- Issue Date
- 2013-02-03
- Director of Research (if dissertation) or Advisor (if thesis)
- Meyn, Sean P.
- Doctoral Committee Chair(s)
- Meyn, Sean P.
- Committee Member(s)
- Blahut, Richard E.
- Milenkovic, Olgica
- Veeravalli, Venugopal V.
- Department of Study
- Electrical & Computer Engineering
- Discipline
- Electrical & Computer Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Hypothesis Testing
- Large Deviations
- Classification
- Large Alphabet
- Feature Extraction
- Abstract
- Statistical hypothesis testing is a method for deciding among two or more hypotheses using measurement data. It includes, for instance, deciding whether a system is in its normal state based on sensor measurements, or whether a person is healthy based on data from medical tests. We are interested in the situation where the amount of measurement data available is limited and the statistical models under the hypotheses have significant uncertainties: for example, a system could have many different abnormal states. The goal of this thesis is to develop appropriate analysis methods for hypothesis testing problems with a small number of observations and uncertainties regarding the hypotheses. We focus on two problems: a universal hypothesis testing problem and a binary classification problem. In the first problem, only one of the hypotheses has a clearly specified statistical model. In the second problem, the statistical model under either hypothesis is only partially known, and training data are available to help learn the model. For both problems, existing analysis based on large deviations has been shown to be a useful tool that leads to asymptotically optimal tests. However, the classical error exponent criterion that forms the foundation of this theory is not applicable to problems where the number of observations is small relative to the number of possible outcomes in each observation (the size of the observation alphabet). We introduce a new performance criterion based on large deviations analysis that generalizes the classical error exponent. The generalized error exponent characterizes how the probability of error depends on the number of observations and the observation alphabet size. It leads to optimal or near-optimal tests and new insights into some existing tests. The generalized error exponent analysis, as well as the classical central limit theorem (CLT) and error exponent analyses, reveals how the size of the alphabet, or more generally the number of features, affects a test's performance. Results from these analyses suggest that quantizing the observations or selecting a subset of features could help improve a test. We develop an optimization-based algorithm that learns the appropriate features from training data. (A minimal sketch of the classical universal test appears below, after this record's description.)
- Graduation Semester
- 2012-12
- Permalink
- http://hdl.handle.net/2142/42475
- Copyright and License Information
- Copyright 2012 Dayu Huang
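
The universal hypothesis testing problem described in the abstract, where only the null model is fully specified, is the setting of the classical Hoeffding-style test: declare the alternative whenever the KL divergence between the empirical distribution of the observations and the known null model exceeds a threshold. The sketch below only illustrates that classical construction; it is not the thesis's generalized-error-exponent test, and the function name, the uniform null distribution, and the threshold choice are illustrative assumptions rather than values taken from the thesis.

    import numpy as np

    def hoeffding_test(samples, p0, threshold):
        # Hoeffding-style universal test: reject the null model p0 when the
        # KL divergence between the empirical distribution of the samples and
        # p0 exceeds a threshold.
        #   samples   -- symbols drawn from the alphabet {0, ..., len(p0)-1}
        #   p0        -- known null distribution over the finite alphabet
        #   threshold -- decision threshold (illustrative; a classical choice
        #                scales like alphabet size / number of samples)
        p0 = np.asarray(p0, dtype=float)
        counts = np.bincount(np.asarray(samples), minlength=len(p0))
        n = counts.sum()
        emp = counts / n                      # empirical (type) distribution
        mask = emp > 0                        # zero-probability terms contribute 0
        kl = np.sum(emp[mask] * np.log(emp[mask] / p0[mask]))
        return kl > threshold                 # True = declare "not p0"

    # Example: uniform null over an alphabet of size 8, 50 observations.
    rng = np.random.default_rng(0)
    p0 = np.full(8, 1 / 8)
    x = rng.integers(0, 8, size=50)           # data actually drawn from p0
    print(hoeffding_test(x, p0, threshold=8 / 50))

The classical error exponent analysis describes this test when the number of observations is large relative to the alphabet size (here 50 versus 8); the generalized error exponent introduced in the thesis addresses the regime where the alphabet size is comparable to, or larger than, the number of observations.
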
Owning Collections
Dissertations and Theses - Electrical and Computer Engineering
Graduate Dissertations and Theses at Illinois (PRIMARY)