Acoustic event, spoken keyword and emotional outburst detection
Xu, Yijia
Description
- Title
- Acoustic event, spoken keyword and emotional outburst detection
- Author(s)
- Xu, Yijia
- Issue Date
- 2019-04-03
- Director of Research (if dissertation) or Advisor (if thesis)
- Hasegawa-Johnson, Mark Allen
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Date of Ingest
- 2019-08-23T20:44:38Z
- Keyword(s)
- audio event detection
- spoken keyword detection
- emotion detection
- speech recognition
- convolutional neural network
- hidden Markov model
- phonetic keyword spotter
- Abstract
- This thesis presents work on three audio detection tasks. It first describes a system for large-scale multi-label acoustic event detection (AED) in YouTube videos. It explores the potential of state-of-the-art deep learning classifiers for AED, reports both qualitative and quantitative results (Hit@1 of 47.9%), and shows that a pre-trained embedding model serves as a powerful feature extractor that can be adapted to new domains with limited data, improving detection accuracy (Hit@1 of 58.1%); a minimal sketch of this adaptation idea follows this metadata list. Second, the thesis focuses on speech acoustic events and the spoken keyword spotting task. It presents a phonetic keyword spotter as a lightweight alternative to full speech recognition (3x faster, with comparable detection rates, and addressing problems inherent to automatic speech recognition). It also explores cross-lingual keyword spotting to support low-resource languages and finds that the acoustic model is the dominant factor in cross-lingual keyword search performance. Third, the thesis presents emotional outburst detection for infant nonspeech acoustic events. It reports on efforts to manually code child utterances as "laugh," "cry," "fuss," "babble," or "hiccup," and to develop algorithms capable of performing the same task automatically.
- Graduation Semester
- 2019-05
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/105158
- Copyright and License Information
- Copyright 2019 Yijia Xu
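
The embedding-as-feature-extractor idea from the abstract lends itself to a short illustration. Below is a minimal sketch, in PyTorch, of freezing a pre-trained audio embedding network and training only a small multi-label classification head on limited in-domain data. The thesis does not specify its model or dimensions, so `EmbeddingNet`, `EMBED_DIM`, `NUM_CLASSES`, and the input shapes here are all assumptions made for illustration, not the thesis's actual pipeline.

```python
import torch
import torch.nn as nn

EMBED_DIM = 128      # assumed embedding size
NUM_CLASSES = 10     # assumed number of event labels in the new domain

class EmbeddingNet(nn.Module):
    """Stand-in for a pre-trained audio embedding model (e.g. a CNN over
    log-mel spectrogram patches); in practice its weights would be loaded
    from the pre-training run, not trained here."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.fc = nn.Linear(32 * 4 * 4, EMBED_DIM)

    def forward(self, x):                  # x: (batch, 1, mels, frames)
        h = self.conv(x).flatten(1)
        return self.fc(h)

embedder = EmbeddingNet()
embedder.eval()
for p in embedder.parameters():            # freeze: use the network purely
    p.requires_grad = False                # as a fixed feature extractor

# Only this small multi-label head is trained on the in-domain data.
head = nn.Linear(EMBED_DIM, NUM_CLASSES)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()         # multi-label AED: one sigmoid per event

def train_step(spectrograms, labels):
    with torch.no_grad():
        feats = embedder(spectrograms)     # fixed embeddings, no gradients
    logits = head(feats)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch just to show the shapes involved.
x = torch.randn(8, 1, 64, 96)              # 8 clips of 64-mel x 96-frame patches
y = (torch.rand(8, NUM_CLASSES) > 0.8).float()
print(train_step(x, y))
```

Because the embedder is frozen, only the head's small parameter set is fit, which is what makes adaptation workable when in-domain labeled data is scarce.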
Owning Collections
- Graduate Dissertations and Theses at Illinois (PRIMARY)
- Dissertations and Theses - Electrical and Computer Engineering