Data-efficient approaches for audio classification and separation
Wang, Zhepei
Permalink
https://hdl.handle.net/2142/121935
Description
- Title
- Data-efficient approaches for audio classification and separation
- Author(s)
- Wang, Zhepei
- Issue Date
- 2023-08-22
- Doctoral Committee Chair(s)
- Smaragdis, Paris
- Committee Member(s)
- Lazebnik, Svetlana
- Hasegawa-Johnson, Mark
- Kim, Minje
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- deep learning
- sound classification
- source separation
- self-supervised learning
- semi-supervised learning
- continual learning
- Abstract
- Recent advances in deep learning for computational audio processing rely on large amounts of annotated audio data. However, obtaining a substantial volume of high-quality annotations from in-the-wild audio remains a significant challenge. In this thesis, we propose and analyze data-efficient approaches for modeling audio signals to perform sound classification and separation. First, we present neural network architectures based on multi-dimensional unrolling of recurrent neural networks that allow the model to perform sound event detection while using training data efficiently. Equipped with adaptive computation, the model further learns to adjust the amount of computation it performs and can process inputs when only partial information is available. Next, we propose approaches for recognizing sound classes under a time-varying distribution. We investigate continual learning techniques that use generative replay to train a classifier that can efficiently learn new sound classes without forgetting previously learned ones. We further extend our approach to an unsupervised learning setup, where the model progressively learns representations for an indefinite number of sound classes when only a few labels are presented. Finally, we investigate learning with limited annotated data using semi-supervised learning. We demonstrate the effectiveness of the proposed teacher-student framework on tasks including cross-modal audio-text representation learning, singing voice separation, and personalized speech enhancement. Overall, our proposed data-efficient algorithms for audio classification and source separation show high potential for reducing the labor cost of collecting high-quality annotated data, improving computational and storage efficiency, and enabling processing on memory-limited edge devices.
- Graduation Semester
- 2023-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Zhepei Wang
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois