SPEECH INTERRUPTION DETECTION FOR LIVE-STREAMING AUDIO
Jin, Xin
Permalink
https://hdl.handle.net/2142/124948
Description
- Title
- SPEECH INTERRUPTION DETECTION FOR LIVE-STREAMING AUDIO
- Author(s)
- Jin, Xin
- Issue Date
- 2021-05-01
- Keyword(s)
- speech interruption detection; live streaming; support vector machine; k-nearest neighbor; multilayer perceptron; mean opinion score.
- Abstract
- Conversation is an important human activity in which multiple people take turns, starting and stopping naturally. An interruption occurs when one speaker talks over another, either intentionally or unintentionally. Frequent interruptions can significantly degrade the experience and reduce the efficiency of a conversation, and they can happen more often in live-streamed audio calls with significant internet delays. Detecting interruptions can help live-streaming companies that care about their quality of service. It can also support audio preprocessing and labeling for speech-to-text models and help estimate the level of conflict in a debate. This project aims to assess the quality of interrupted speech in live-streaming audio. The task of interruption detection was divided into two steps: generating a simulated dataset of interrupted speech audio, and building machine learning models for interruption detection. The dataset was created synthetically by concatenating and overlapping speech recordings with different interruption times and latency times. Detection performance was examined with a k-nearest neighbor classifier, a support vector machine (SVM) classifier, and a multilayer perceptron model. Each model takes a 0.5 s audio segment as an input array and predicts whether that segment contains interrupted speech. The results indicate that the SVM model appears to be effective at detecting interrupted speech in conversational audio, with an accuracy of 92.61% in cross-validation on the training data and 72.62% on unseen data.
- Type of Resource
- text
- Language
- eng
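
The data-generation step described in the abstract (overlapping two recordings at a chosen interruption time, then labeling each 0.5 s segment) can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the 16 kHz sampling rate, and the rule that a segment is labeled 1 if any of its samples contains both speakers are assumptions made for illustration, and the latency parameter mentioned in the abstract is not modeled here.

```python
import numpy as np

SAMPLE_RATE = 16_000     # assumed sampling rate in Hz (not stated in the abstract)
SEGMENT_SECONDS = 0.5    # segment length stated in the abstract


def overlap_speakers(first, second, interrupt_at_s, sample_rate=SAMPLE_RATE):
    """Mix `second` into `first`, starting `interrupt_at_s` seconds in.

    Returns the mixed signal and a per-sample boolean mask that is True
    wherever both speakers are active at the same time.
    """
    start = int(round(interrupt_at_s * sample_rate))
    if not 0 <= start <= len(first):
        raise ValueError("interruption must start within the first recording")
    total = max(len(first), start + len(second))
    mixed = np.zeros(total, dtype=np.float64)
    mixed[:len(first)] += first
    mixed[start:start + len(second)] += second
    overlap = np.zeros(total, dtype=bool)
    overlap[start:min(len(first), start + len(second))] = True
    return mixed, overlap


def segment_with_labels(mixed, overlap, sample_rate=SAMPLE_RATE,
                        segment_seconds=SEGMENT_SECONDS):
    """Split the mix into fixed-length segments with a 0/1 label each.

    A segment is labeled 1 if any sample in it contains overlapping speech
    (an assumed labeling rule). A trailing partial segment is dropped.
    """
    seg_len = int(segment_seconds * sample_rate)
    n_segments = len(mixed) // seg_len
    usable = n_segments * seg_len
    segments = mixed[:usable].reshape(n_segments, seg_len)
    labels = overlap[:usable].reshape(n_segments, seg_len).any(axis=1).astype(int)
    return segments, labels


# Two 2-second tones stand in for real speech recordings.
t = np.arange(2 * SAMPLE_RATE) / SAMPLE_RATE
speaker_a = 0.5 * np.sin(2 * np.pi * 220.0 * t)
speaker_b = 0.5 * np.sin(2 * np.pi * 330.0 * t)

mixed, overlap = overlap_speakers(speaker_a, speaker_b, interrupt_at_s=1.25)
segments, labels = segment_with_labels(mixed, overlap)
print(segments.shape, labels.tolist())
```

Each row of `segments` could then be passed, directly or after feature extraction, to a classifier such as the k-nearest neighbor, SVM, or multilayer perceptron models named in the abstract.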
Owning Collections
Senior Theses - Electrical and Computer Engineering PRIMARY
The best of ECE undergraduate research