Automatic Lipreading to Enhance Speech Recognition (Speech Reading)
Petajan, Eric David
This item is only available for download by members of the University of Illinois community. Students, faculty, and staff at the U of I may log in with your NetID and password to view the item. If you are trying to access an Illinois-restricted dissertation or thesis, you can request a copy through your library's Inter-Library Loan office or purchase a copy directly from ProQuest.
Permalink
https://hdl.handle.net/2142/69292
Description
Title
Automatic Lipreading to Enhance Speech Recognition (Speech Reading)
Author(s)
Petajan, Eric David
Issue Date
1984
Department of Study
Electrical Engineering
Discipline
Electrical Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Engineering, Electronics and Electrical
Abstract
Automatic recognition of the acoustic speech signal alone is inaccurate and computationally expensive. Additional sources of speech information, such as lipreading (or speechreading), should enhance automatic speech recognition, just as lipreading is used by humans to enhance speech recognition when the acoustic signal is degraded. This paper describes an automatic lipreading system which has been developed. A commercial device performs the acoustic speech recognition independently of the lipreading system.
The recognition domain is restricted to isolated utterances and speaker dependent recognition. The speaker faces a solid state camera which sends digitized video to a minicomputer system with custom video processing hardware. The video data is sampled during an utterance and then reduced to a template consisting of visual speech parameter time sequences. The distances between the incoming template and all of the trained templates for each utterance in the vocabulary are computed and a visual recognition candidate is obtained. The combination of the acoustic and visual recognition candidates is shown to yield a final recognition accuracy which greatly exceeds the acoustic recognition accuracy alone. Practical considerations and the possible enhancement of speaker independent and continuous speech recognition systems are also discussed.
Use this login method if you
don't
have an
@illinois.edu
email address.
(Oops, I do have one)
IDEALS migrated to a new platform on June 23, 2022. If you created
your account prior to this date, you will have to reset your password
using the forgot-password link below.