Knowledge acquisition for natural language understanding
Lai, Tuan Manh
Permalink
https://hdl.handle.net/2142/121297
Description
Title
Knowledge acquisition for natural language understanding
Author(s)
Lai, Tuan Manh
Issue Date
2023-05-24
Director of Research (if dissertation) or Advisor (if thesis)
Ji, Heng
Doctoral Committee Chair(s)
Ji, Heng
Committee Member(s)
Zhai, ChengXiang
Han, Jiawei
Bui, Trung H
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Natural Language Processing
Information Extraction
Deep Learning
Large Language Models
Abstract
Large neural models pretrained on vast volumes of text have achieved remarkable success in various natural language processing tasks. However, these models may still face challenges in knowledge-intensive tasks due to their training methods, which typically focus on learning directly from raw texts and do not incorporate existing linguistic resources or structured domain knowledge. This thesis aims to develop methods to effectively incorporate external knowledge into existing neural models to enhance their performance.
We propose three novel approaches that incorporate external knowledge into neural models at varying levels of explicitness, accommodating a wide range of use cases.
The first approach involves incorporating various types of domain knowledge from multiple sources into language models using lightweight adapter modules. For each knowledge source of interest, we train an adapter module to capture the knowledge in a self-supervised way. The knowledge encoded in the adapters can then be combined for downstream tasks using fusion layers. This approach provides an easy-to-use, implicit way of incorporating external knowledge.
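To illustrate the adapter idea, the following is a minimal PyTorch-style sketch; the class names, dimensions, and fusion mechanism are illustrative assumptions rather than the thesis implementation. Each knowledge source gets a small bottleneck adapter, and a fusion layer attends over the adapters' outputs for a downstream task.

# Minimal sketch (assumed names and shapes, not the thesis code) of a
# bottleneck adapter plus attention-based fusion over several adapters.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Lightweight bottleneck adapter: down-project, nonlinearity, up-project."""
    def __init__(self, hidden_size=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, h):
        # Residual connection keeps the backbone representation intact.
        return h + self.up(self.act(self.down(h)))

class AdapterFusion(nn.Module):
    """Attend over the outputs of multiple knowledge-specific adapters."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)

    def forward(self, h, adapter_outputs):
        # adapter_outputs: (batch, seq, num_adapters, hidden)
        q = self.query(h).unsqueeze(2)                 # (B, S, 1, H)
        k = self.key(adapter_outputs)                  # (B, S, A, H)
        weights = (q * k).sum(-1).softmax(dim=-1)      # (B, S, A)
        return (weights.unsqueeze(-1) * adapter_outputs).sum(2)

# Usage: one adapter per knowledge source, fused for a downstream task.
h = torch.randn(2, 16, 768)
adapters = nn.ModuleList([Adapter() for _ in range(3)])
outs = torch.stack([a(h) for a in adapters], dim=2)
fused = AdapterFusion()(h, outs)

Because each adapter is small and trained separately, knowledge sources can be added or dropped without retraining the backbone model.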
The second approach uses a retrieval system to fetch relevant passages from a knowledge base, which are then provided as additional input to a generation model that produces the output. To train the retrieval components, we use a novel method for generating pseudo-labels, avoiding the need to collect costly gold-standard retrieval labels. This approach offers a more explicit way of accessing and using external knowledge than the adapter-based approach and provides greater interpretability. Additionally, new knowledge can typically be added to the knowledge base without updating any parameters of the neural component.
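One simple way such retrieval pseudo-labels could be derived is sketched below; this is an assumption-laden illustration (token-overlap scoring against the gold response), not necessarily the labeling method used in the thesis.

# Sketch: score candidate passages by token-level F1 overlap with the gold
# response and treat the best-scoring passage as a pseudo positive.
from collections import Counter

def f1_overlap(passage, response):
    """Token-level F1 between a candidate passage and the gold response."""
    p, r = passage.lower().split(), response.lower().split()
    common = sum((Counter(p) & Counter(r)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(r)
    return 2 * precision * recall / (precision + recall)

def pseudo_label(passages, gold_response):
    """Return the index of the passage most similar to the gold response."""
    scores = [f1_overlap(p, gold_response) for p in passages]
    return max(range(len(passages)), key=scores.__getitem__)

# Hypothetical example: the highest-overlap passage becomes the retrieval target.
passages = ["Aspirin treats pain and fever.", "Paris is the capital of France."]
print(pseudo_label(passages, "Aspirin is used to treat fever."))  # -> 0

The pseudo positives (and the remaining passages as negatives) can then be used to train a dense retriever with a standard contrastive objective.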
The third approach involves using entity linking to extract the exact part of a knowledge graph that is relevant to the task at hand. We then utilize graph neural networks to incorporate the extracted subgraph into the existing neural model. This approach provides an even more explicit way of incorporating external knowledge, allowing for fine-grained control over what knowledge to incorporate and offering even more interpretability.
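To make the third approach concrete, here is a minimal sketch of encoding an extracted subgraph with a plain graph-convolution layer and combining the result with a text representation; the layer, shapes, and fusion step are assumptions for illustration, not the thesis architecture.

# Sketch: encode a task-relevant knowledge-graph subgraph with one graph
# convolution and concatenate the pooled graph vector with a text vector.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbor features, then project."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, node_feats, adj):
        # Row-normalize the adjacency (with self-loops) and propagate features.
        adj = adj + torch.eye(adj.size(0))
        deg = adj.sum(-1, keepdim=True)
        return torch.relu(self.proj((adj / deg) @ node_feats))

# Hypothetical subgraph with 4 linked entities and 128-dim node embeddings.
node_feats = torch.randn(4, 128)
adj = torch.tensor([[0., 1, 0, 0],
                    [1, 0, 1, 1],
                    [0, 1, 0, 0],
                    [0, 1, 0, 0]])
graph_repr = GCNLayer(128)(node_feats, adj).mean(dim=0)  # pooled subgraph vector
text_repr = torch.randn(128)                             # from the base encoder
combined = torch.cat([text_repr, graph_repr])            # fed to the task head

Because only entities linked from the input text enter the subgraph, this design gives fine-grained control over exactly which pieces of the knowledge graph influence the prediction.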
We demonstrate the effectiveness of our proposed methods on various knowledge-intensive natural language processing tasks, including biomedical information extraction and knowledge-grounded dialog. We show that incorporating external knowledge can help overcome the difficulty of learning domain-specific knowledge and enhance the model's efficiency and interpretability. Our methods also allow for natural updates and additions of external knowledge, providing a flexible and scalable way of enhancing large neural language models. Overall, our methods achieve state-of-the-art results on many benchmarks.