Weakly supervised aspect extraction for domain-specific texts
Guo, Fang
Permalink
https://hdl.handle.net/2142/108548
Description
- Title
- Weakly supervised aspect extraction for domain-specific texts
- Author(s)
- Guo, Fang
- Issue Date
- 2020-07-24
- Director of Research (if dissertation) or Advisor (if thesis)
- Han, Jiawei
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Aspect Extraction
- Weakly-supervised
- Abstract
- Aspect extraction, which identifies the aspects of text segments from a pre-defined set of aspects, is one of the keystones of text understanding. It benefits numerous applications, including sentiment analysis and product review summarization. Most existing aspect extraction methods rely heavily on human-curated aspect annotations of massive numbers of text segments, which makes them expensive to apply in specific domains. Recent attempts that leverage clustering methods can reduce this annotation effort, but they require domain-specific knowledge and further effort to filter, aggregate, and align the clustering results to the desired aspects. Therefore, in this paper, we explore extracting aspects from domain-specific raw texts with very limited supervision: only a few user-provided seed words per aspect. Specifically, our proposed neural model is equipped with multi-head attention and self-training. The multi-head attention is learned from the seed words to ensure that aspect-related words in text segments are weighted higher than unrelated ones. The self-training mechanism provides additional pseudo labels to supplement the limited supervision. Extensive experiments on real-world datasets demonstrate the superior performance of our proposed framework, as well as the effectiveness of both the attention module and the self-training mechanism. Case studies on the attention weights further shed light on the interpretability of our aspect extraction results.
- Graduation Semester
- 2020-08
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/108548
- Copyright and License Information
- Copyright 2020 Fang Guo
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science