Consistent and efficient long document understanding
Zeng, Qi
Permalink
https://hdl.handle.net/2142/121969
Description
- Title
- Consistent and efficient long document understanding
- Author(s)
- Zeng, Qi
- Issue Date
- 2023-11-03
- Director of Research (if dissertation) or Advisor (if thesis)
- Ji, Heng
- Doctoral Committee Chair(s)
- Ji, Heng
- Committee Member(s)
- Tong, Hanghang
- Zhao, Han
- Wang, Lu
- Li, Lei
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Natural Language Processing
- Abstract
- In the age of information overload, people's information needs from long documents are growing rapidly, while their patience for careful reading and reasoning is vanishing. Although people are inundated with long textual documents covering domains such as news, healthcare, legal services, and finance, they struggle to gain quick, concise, and accurate insights from these long and tedious documents. Automatic document understanding systems promise to assist humans in gaining insights from such documents: they capture and analyze the information contained in collections of news and scientific reports in a concise, machine-understandable way; they parse unstructured text by identifying the relations between events and entities in long, complex documents to produce structured data; and they provide reliable digests by factually and consistently summarizing recent papers, reports, news, and reviews. However, automatically understanding long documents remains challenging because recent state-of-the-art document understanding systems are mostly built on transformer architectures and are largely motivated, designed, implemented, and evaluated under a short-input setting. To adapt these short-input systems to long sequences, documents must be truncated, chunked with a sliding window, or processed in parallel on multiple machines. These additional operations usually lose long-range interdependencies and introduce extra cost. This thesis therefore focuses on developing principled and scalable methods for more consistent and efficient long document understanding. In particular, we investigate four research problems from the perspectives of consistency and efficiency: 1) Consistent Meta-review Generation. Current work on opinion summarization extracts and selects representative opinions on aspects of interest under the assumption that input opinions are non-controversial. Opinions in the scientific domain can be divergent, leading to controversy or consensus among reviewers, while a scientific meta-review should be consistent with the opinions synthesized from individual reviews. We therefore benchmark scientific opinion summarization by collecting paper meta-reviews from OpenReview, proposing a Checklist-guided Iterative Introspection approach, and constructing a comprehensive evaluation framework. 2) Consistent Document Summarization. Current abstractive summarization models often generate inconsistent content, i.e., text that is not directly inferable from the source document, contradicts world knowledge, or is self-contradictory. To improve general consistency, we introduce EnergySum, which applies a Residual Energy-based Model: we design energy scorers that reflect each type of consistency and incorporate them into the sampling process. 3) Consistent Document-level Event Argument Extraction. Recent work on document-level event argument extraction models each event in isolation and therefore produces inconsistent arguments across events, which in turn causes discrepancies in downstream applications. To address this problem, we formulate event argument consistency as constraints derived from event-event relations at the document level and introduce the Event-Aware Argument Extraction (EA²E) model, which uses augmented context for training and inference. 4) Efficient Document Processing. Transformer-based models are inefficient at processing long sequences due to the quadratic space and time complexity of the self-attention modules. To address this limitation, we introduce two methods for self-attention acceleration: a modified Nyström method (Skyformer) that accelerates kernelized attention and stabilizes training, and a sketching-based method (Skeinformer) that applies sub-sampling sketching.
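The quadratic cost the abstract refers to comes from forming the full n×n attention matrix. As a rough illustration of the landmark-based Nyström idea behind approaches like Skyformer, the sketch below approximates softmax attention through m landmark rows, reducing the cost from O(n²) to O(nm). This is a minimal generic Nyström sketch, not the thesis's actual Skyformer or Skeinformer algorithm; the landmark choice (segment means) and all function names are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def exact_attention(Q, K, V):
    """Standard softmax attention: materializes the full n x n matrix."""
    return softmax(Q @ K.T / np.sqrt(Q.shape[1])) @ V

def nystrom_attention(Q, K, V, m=8):
    """Landmark-based Nystrom approximation of softmax attention.

    Instead of the n x n kernel, we form three small kernels through
    m landmark rows (here: simple segment means of Q and K), so the
    cost scales with n*m rather than n^2.
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    # Landmarks: mean of each contiguous segment of the sequence.
    segments = np.array_split(np.arange(n), m)
    Q_land = np.stack([Q[s].mean(axis=0) for s in segments])  # (m, d)
    K_land = np.stack([K[s].mean(axis=0) for s in segments])  # (m, d)
    F = softmax(Q @ K_land.T * scale)       # (n, m): queries vs. landmark keys
    A = softmax(Q_land @ K_land.T * scale)  # (m, m): landmark core kernel
    B = softmax(Q_land @ K.T * scale)       # (m, n): landmark queries vs. keys
    # Nystrom reconstruction: F @ pinv(A) @ B approximates the n x n kernel.
    return F @ np.linalg.pinv(A) @ (B @ V)
```

With m fixed and n growing, the three small kernels stay cheap, which is the essence of why landmark and sketching methods make long-sequence attention tractable.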
- Graduation Semester
- 2023-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Qi Zeng
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)