Machine learning approaches to improving mispronunciation detection on an imbalanced corpus
Yang, Xuesong
Permalink
https://hdl.handle.net/2142/89050
Description
Title
Machine learning approaches to improving mispronunciation detection on an imbalanced corpus
Author(s)
Yang, Xuesong
Issue Date
2015-12-07
Director of Research (if dissertation) or Advisor (if thesis)
Hasegawa-Johnson, Mark A.
Department of Study
Electrical & Computer Engineering
Discipline
Electrical & Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Imbalanced Learning
Sampling Methods
Pronunciation Error Detection
Spoken Language Assessment
Computer Assisted Language Learning
Abstract
This thesis investigates phone-level pronunciation error detection, a task whose performance is heavily affected by the imbalanced class distribution in a manually annotated data set of non-native English (Read Aloud responses from the TOEFL Junior Pilot assessment). To address the problems caused by this extreme class imbalance, two machine learning approaches, cost-sensitive learning and over-sampling, are explored to improve classification performance. Specifically, class weights inversely proportional to class frequencies and the synthetic minority over-sampling technique (SMOTE) were applied to a range of classifiers using feature sets that included information about the acoustic signal, the linguistic properties of the utterance, and word identity. Experiments demonstrate that both balancing approaches lead to a substantial improvement in F1 score over the baseline on this extremely imbalanced data set. The thesis also discusses which features are most important and which classifiers are most effective for identifying phone-level pronunciation errors in non-native speech.
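The two balancing ideas named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `inverse_frequency_weights` and `smote` are hypothetical, the weight normalization n_samples / (n_classes * class_count) is one common convention assumed here, the SMOTE routine is a simplified version of the general technique, and the toy data is made up for illustration.

```python
import math
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumed normalization; any weight proportional to 1 / class_count
    expresses the same idea.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def smote(minority, n_synthetic, k=2, seed=0):
    """Simplified SMOTE: interpolate between a minority point and one of
    its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Toy labels (made up): 1 = mispronounced phone, 0 = correctly pronounced phone
labels = [0] * 18 + [1] * 2
weights = inverse_frequency_weights(labels)

# Toy 2-D feature vectors (made up) for the minority class
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority, n_synthetic=5)
```

In this sketch the rare class receives the larger weight, so a cost-sensitive classifier would penalize its errors more heavily, while the over-sampling route instead adds the synthetic points to the training set before fitting an unweighted classifier.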