Representation learning of natural language and its application to language understanding and generation

Gong, Hongyu

Permalink

https://hdl.handle.net/2142/108110

Description

Title

Representation learning of natural language and its application to language understanding and generation

Author(s)

Gong, Hongyu

Issue Date

2020-04-15

Director of Research (if dissertation) or Advisor (if thesis)

Bhat, Suma

Doctoral Committee Chair(s)

Bhat, Suma

Committee Member(s)

Viswanath, Pramod
Srikant, Rayadurgam
Hwu, Wen-mei
Fanti, Giulia

Department of Study

Electrical and Computer Engineering

Discipline

Electrical and Computer Engineering

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Date of Ingest

2020-08-26T23:54:33Z

Keyword(s)

Natural Language Processing
Representation Learning
Language Understanding
Language Generation

Abstract

How to represent language properly is a fundamental problem in Natural Language Processing (NLP). Language representation learning aims to encode rich information, such as the syntax and semantics of a language, into dense vectors, which facilitates the modeling, manipulation, and analysis of natural language in computational linguistics. Existing algorithms use corpus statistics such as word co-occurrences to learn general-purpose language representations. Recent advances in generic representations integrate richer information, such as contextualized features, from unlabeled text corpora. In this dissertation, we continue this line of research by incorporating rich knowledge into generic embeddings. We show that word representations can be enriched with various kinds of information, including temporal and spatial variations as well as syntactic functionalities, and that text representations can be refined with topical knowledge. Moreover, we develop insight into the geometry of pre-trained representations and connect it to semantic understanding tasks such as identifying idiomatic word usage. Besides generic representations, task-dependent representations are also studied extensively in downstream applications, where the representation is trained to encode domain information from labeled datasets. This dissertation leverages the capability of neural network models to integrate task-specific supervision into language representations. We introduce new deep learning models and algorithms that train representations with external knowledge from annotated data. We show that the learned representations assist in various downstream tasks, both in language understanding, such as text classification, and in language generation, such as text style transfer.

Graduation Semester

2020-05

Type of Resource

Thesis

Owning Collections

Graduate Dissertations and Theses at Illinois (primary)

Dissertations and Theses - Electrical and Computer Engineering
