Speech synthesis using Mel-Cepstral coefficient feature
Wang, Lu
Permalink
https://hdl.handle.net/2142/100043
Description
Title
Speech synthesis using Mel-Cepstral coefficient feature
Author(s)
Wang, Lu
Contributor(s)
Hasegawa-Johnson, Mark
Issue Date
2018-05
Keyword(s)
Speech Synthesis
Cepstrum Analysis
Abstract
This thesis presents a method for improving the quality of synthesized speech by reducing the vocoded effect. The synthesis model takes mel-cepstral coefficients and spectral envelopes as features of the original speech waveform. Mel-cepstral coefficients can be used to generate natural-sounding voice and reduce the artificial effect. Compared with conventional linear predictive coding (LPC) coefficients, which are also widely used in speech synthesis, mel-cepstral coefficients can resemble the human voice more closely by giving the synthesized speech more detail in the low-frequency band. The model uses a synthesis filter to estimate the log spectrum, with both zeros and poles in the transfer function, together with a mixed excitation technique that divides the speech signal into multiple frequency bands to better approximate natural speech production.
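The claim about extra detail in the low-frequency band can be illustrated with the frequency warping that commonly underlies mel-cepstral analysis. The sketch below is not taken from the thesis. It assumes the usual first-order all-pass warping with a warping factor of 0.42 at a 16 kHz sampling rate, a value often used to approximate the mel scale, and the function names `warp_frequency` and `band_share` are hypothetical. It compares how much of the warped frequency axis a band occupies against its share of the linear axis.

```python
import math


def warp_frequency(omega, alpha=0.42):
    """Warped frequency (radians) for a linear frequency omega in [0, pi].

    This is the phase response of the first-order all-pass filter
    (z^-1 - alpha) / (1 - alpha * z^-1). The default alpha is an assumed,
    commonly used value for 16 kHz speech, not a value from the thesis.
    """
    return omega + 2.0 * math.atan(
        alpha * math.sin(omega) / (1.0 - alpha * math.cos(omega))
    )


def band_share(lo_hz, hi_hz, sample_rate=16000, alpha=0.42):
    """Fraction of the warped axis [0, pi] occupied by the band [lo_hz, hi_hz]."""
    def to_rad(hz):
        return 2.0 * math.pi * hz / sample_rate

    width = warp_frequency(to_rad(hi_hz), alpha) - warp_frequency(to_rad(lo_hz), alpha)
    return width / math.pi


if __name__ == "__main__":
    # Each band below covers 1/8 (0.125) of the linear axis from 0 to 8 kHz.
    print("0-1 kHz share of warped axis:", round(band_share(0, 1000), 3))
    print("7-8 kHz share of warped axis:", round(band_share(7000, 8000), 3))
```

Under these assumptions the lowest 1 kHz band takes up more than its linear share of the warped axis and the highest 1 kHz band takes up less, so a fixed number of cepstral coefficients computed on the warped axis spends more of its resolution on low frequencies. Setting `alpha=0` removes the warping and recovers the linear frequency axis.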