Towards the processing of semantic idiomaticity in natural language
Zeng, Ziheng
This item's files can only be accessed by the Administrator group.
Permalink
https://hdl.handle.net/2142/120501
Description
- Title
- Towards the processing of semantic idiomaticity in natural language
- Author(s)
- Zeng, Ziheng
- Issue Date
- 2023-03-24
- Director of Research (if dissertation) or Advisor (if thesis)
- Bhat, Suma
- Doctoral Committee Chair(s)
- Bhat, Suma
- Committee Member(s)
- Varshney, Lav
- Hasegawa-Johnson, Mark
- Viswanath, Pramod
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Natural language processing
- idiomatic expression
- non-compositional language
- idiom identification
- paraphrase generation
- representation learning
- commonsense knowledge graph
- Abstract
- Pre-trained language models (PTLMs) are the cornerstone of recent breakthroughs in natural language processing (NLP). They enable machines to comprehend, analyze, and utilize human language in a multitude of applications. Despite their impressive abilities, PTLMs face challenges when dealing with idiomatic expressions (IEs), a common form of figurative language. IEs exhibit semantic non-compositionality, meaning that their figurative meaning cannot be derived from their constituent words, and contextual ambiguity, meaning that they can be interpreted literally depending on the context. This dissertation aims to explore the challenges posed by IEs to PTLMs and to advance the development of more effective NLP systems capable of processing semantic idiomaticity both explicitly and implicitly. To address the problem explicitly, we propose methods to detect and remove IEs from natural sentences and thus directly reduce the influence of IEs. Specifically, we investigate an IE identification approach that detects and localizes expressions that exhibit non-compositionality and an IE paraphrasing method that replaces the expressions with their literal counterpart phrases. To address the issue implicitly, we tackle the more fundamental issue of IE comprehension by enabling effective IE representation and introducing commonsense knowledge to PTLMs. We study methods to enhance PTLMs' ability to produce semantically meaningful and contextually appropriate embeddings for IEs. Additionally, we create an IE-centered commonsense knowledge graph and train PTLMs to become commonsense knowledge models capable of understanding and generating inferential knowledge about IE uses. Our research shows that with enhanced innate IE comprehension ability, PTLMs' performance on IE processing tasks and IE-related language understanding tasks improves naturally. By pursuing both explicit and implicit approaches to processing semantic idiomaticity, our research has accomplished the overarching goal of extending the foundational capacity of PTLMs to perform natural language understanding in the presence of IEs.
- Graduation Semester
- 2023-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Ziheng Zeng
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois