Breaking down barriers: advancing interdisciplinary speech applications in early children’s development
Li, Jialu
Permalink
https://hdl.handle.net/2142/124412
Description
- Title
- Breaking down barriers: advancing interdisciplinary speech applications in early children’s development
- Author(s)
- Li, Jialu
- Issue Date
- 2024-04-26
- Director of Research (if dissertation) or Advisor (if thesis)
- Hasegawa-Johnson, Mark
- Doctoral Committee Chair(s)
- Hasegawa-Johnson, Mark
- Committee Member(s)
- McElwain, Nancy L
- Bhat, Suma
- Varshney, Lav R
- Department of Study
- Electrical and Computer Engineering
- Discipline
- Electrical and Computer Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- child vocalizations classifications
- child-adult speaker diarization
- autism diagnosis
- family audio analysis
- self-supervised learning
- transfer learning
- interdisciplinary speech applications
- Abstract
- This thesis develops interdisciplinary speech applications that use machine learning to identify, at an early age, children with developmental disorders or speech and language delays. Specifically, we build machine learning models that capture critical adult-child interactions in different social contexts: turn-taking vocalizations between parents and infants (under 14 months old) at home, and joint attention between clinicians and toddlers (1-2 years old) at clinics. Turn-taking vocalizations are considered coordinated interactions; non-responses and co-vocalizations are considered uncoordinated interactions. Previous research has shown that uncoordinated interactions that are repeated and reinforced daily may contribute to mental health problems in children in the long run. In autism screening, detecting whether clinicians and children establish joint attention during semi-structured assessments is crucial, as joint attention is considered a key factor in diagnosing autism. To capture these interactions, we focus on two speech-processing tasks: speaker diarization (identifying who spoke when) and vocalization classification (identifying the type of vocalization produced by a given speaker). Because annotating audio is labor-intensive and time-consuming, the thesis addresses the technical difficulty of improving speech-processing models when only a limited amount of labeled audio is available. We explore several transfer learning techniques within supervised learning and leverage self-supervised learning to enhance child audio analysis. With the self-supervised learning scheme, we show that the proposed interdisciplinary speech applications achieve significant performance gains on child audio analysis tasks. This thesis extends speech technology beyond traditional applications such as speech-to-text and text-to-speech, exploring its potential in other disciplines such as psychology and healthcare.
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Jialu Li
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois