Incorporating Knowledge Resources into Natural Language Processing Techniques to Advance Academic Research and Application Development
Han, Kanyao
Permalink
https://hdl.handle.net/2142/118097
Description
Author(s)
Han, Kanyao
Contributor(s)
Han, Kanyao
Issue Date
2023-05
Keyword(s)
Dissertation Proposal
Abstract
The rapid advancement of natural language processing (NLP) and machine learning (ML) techniques, coupled with the accumulation of data and knowledge resources in recent decades, has opened numerous opportunities for social and scientific studies, as well as for developing applications used in daily life (e.g., chatbots and online search engines). However, challenges persist, including a lack of sufficient annotated training data to build or fine-tune NLP and ML models, noisy data with incomplete information for specific needs, and the difficulty of adapting generic pre-trained models to domain-specific downstream tasks.
Leveraging knowledge resources, which I define as data or human resources that contain dense and typically structured knowledge within specific domains, holds promise for advancing NLP and ML techniques in support of social and scientific studies and of application design and development for everyday use. In this dissertation, I investigate various knowledge resources that can be mined and incorporated into NLP techniques for these purposes. Specifically, this dissertation will present four studies: trimming the Wikipedia Category Tree for domain-specific tasks; disambiguating funder names and predicting funder characteristics for funding allocation studies based on community-curated resources; developing socially responsible chatbots for purchase decision-making based on online platform data; and categorizing domain-specific documents based on small annotated data and an expert-in-the-loop approach.
These studies contribute 1) knowledge of how to use existing knowledge resources for specific domains or tasks, 2) novel frameworks for cleaning, mining, and utilizing these knowledge resources, and 3) models and systems that can be directly used for tasks such as funder name disambiguation and question answering.