A theory of (almost) zero resource speech recognition
Bharadwaj, Sujeeth Subramanya
Permalink
https://hdl.handle.net/2142/78343
Description
- Title
- A theory of (almost) zero resource speech recognition
- Author(s)
- Bharadwaj, Sujeeth Subramanya
- Issue Date
- 2015-03-31
- Director of Research (if dissertation) or Advisor (if thesis)
- Hasegawa-Johnson, Mark A.
- Doctoral Committee Chair(s)
- Hasegawa-Johnson, Mark A.
- Committee Member(s)
- Levinson, Stephen E.
- Liang, Feng
- Smaragdis, Paris
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Date of Ingest
- 2015-07-22T22:16:25Z
- Keyword(s)
- Speech recognition
- Unsupervised learning
- PAC-Bayesian theory
- Language Modeling
- Acoustic Event Detection
- anomaly detection
- Abstract
- Automatic speech recognition has matured into a commercially successful technology, enabling voice-based interfaces for smartphones, smart TVs, and many other consumer devices. The overwhelming popularity, however, is still limited to languages such as English, Japanese, and German, where vast amounts of labeled training data are available. For most other languages, it is prohibitively expensive to 1) collect and transcribe the speech data required to learn good acoustic models; and 2) acquire adequate text to estimate meaningful language models. A theory of unsupervised and semi-supervised techniques for speech recognition is therefore essential. This thesis focuses on HMM-based sequence clustering and examines acoustic modeling, language modeling, and applications beyond the components of an ASR system, such as anomaly detection, from the vantage point of PAC-Bayesian theory. The first part of this thesis extends standard PAC-Bayesian bounds to address the sequential nature of speech and language signals. A novel algorithm, based on sparsifying the cluster assignment probabilities with a Rényi entropy prior, is shown to provably minimize the generalization error of any probabilistic model (e.g., HMMs). The second part examines application-specific loss functions such as cluster purity and perplexity. Empirical results on a variety of tasks -- acoustic event detection, class-based language modeling, and unsupervised sequence anomaly detection -- confirm the practicality of the theory and algorithms developed in this thesis.
- Graduation Semester
- 2015-5
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/78343
- Copyright and License Information
- Copyright 2015 Sujeeth Bharadwaj
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Dissertations and Theses - Electrical and Computer Engineering