English complex verb constructions: identification and inference
Tu, Yuancheng
Permalink
https://hdl.handle.net/2142/34378
Description
- Title
- English complex verb constructions: identification and inference
- Author(s)
- Tu, Yuancheng
- Issue Date
- 2012-09-18T21:14:08Z
- Director of Research (if dissertation) or Advisor (if thesis)
- Roth, Dan
- Doctoral Committee Chair(s)
- Shih, Chilin
- Committee Member(s)
- Roth, Dan
- Girju, Roxana
- Hockenmaier, Julia C.
- Department of Study
- Linguistics
- Discipline
- Linguistics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- computational lexical semantics
- Multiword Expressions
- Complex Verb Predicates
- Light Verb Constructions
- Phrasal Verb Constructions
- textual entailment
- factive and implicative verbs
- natural language processing
- supervised machine learning
- Abstract
- The fundamental problem faced by automatic text understanding in Natural Language Processing (NLP) is to identify semantically related pieces of text and integrate them to compute the meaning of the whole text. However, the principle of compositionality runs into trouble very quickly when real language is examined, with its frequent Multiword Expressions (MWEs) whose meaning is not based on the meaning of their parts. MWEs occur in all text genres, are far more frequent and productive than is generally recognized, and pose serious difficulties for every kind of NLP application. Among these diverse kinds of MWEs, this dissertation focuses on English verb-related MWEs, constructs stochastic models to identify these complex verb predicates within a given context, and discusses empirically the significance of this MWE recognition component in the context of Textual Entailment (TE), an intricate semantic inference task that involves various levels of linguistic knowledge and logical reasoning. This dissertation develops high-quality computational models for three of the most frequent kinds of English complex verb constructions: Light Verb Construction (LVC), Phrasal Verb Construction (PVC) and Embedded Verb Construction (EVC), and demonstrates empirically their usage in textual entailment. The discriminative model for LVC identification achieves 86.3% accuracy when trained with groups of either contextual or statistical features. For PVC identification, the learning model reaches 79.4% accuracy, a 41.1% error reduction compared to the baseline. In addition, adding the LVC classifier helps a simple but robust lexical TE system achieve a 39.5% error reduction in accuracy and a 21.6% absolute improvement in F1. Similar improvements are achieved by adding the PVC and EVC classifiers to this entailment system, with absolute accuracy improvements of 30.6% and 39.4%, respectively.
In this dissertation, two types of automation are achieved with respect to English complex verb predicates: learning to recognize these MWEs within a given context, and discovering the significance of this identification within an empirical semantic NLP application, namely textual entailment. The lack of benchmark datasets for these special linguistic phenomena is the main bottleneck to advancing computational research on them. The study presented in this dissertation provides two benchmark datasets for the identification of LVCs and PVCs, respectively, and three phenomenon-specific TE datasets to automate the investigation of the significance of these linguistic phenomena within a TE system. These datasets enable a direct evaluation and comparison of lexically based models, reveal insightful differences between them, and support the creation of a simple but robust improved model combination. In the long run, we believe that the availability of these datasets will facilitate improved models that consider the various special multiword-related phenomena within complex semantic systems, as well as the application of supervised machine learning models to optimize model combination and performance.
- Graduation Semester
- 2012-08
- Permalink
- http://hdl.handle.net/2142/34378
- Copyright and License Information
- Copyright 2012 Yuancheng Tu
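The abstract reports the PVC result as an accuracy (79.4%) together with an error reduction (41.1%) relative to a baseline whose accuracy is not stated. The sketch below shows how the two figures relate, assuming "error reduction" means the relative reduction in error rate (1 minus accuracy); the abstract does not define the term, and the function name is hypothetical. The derived baseline is an implication of that assumption, not a number reported in the dissertation record.

```python
def implied_baseline_accuracy(model_accuracy: float, error_reduction: float) -> float:
    """Back out the baseline accuracy implied by a relative error reduction.

    Assumption (not stated in the abstract):
        error_reduction = (baseline_error - model_error) / baseline_error
    where error = 1 - accuracy.
    """
    model_error = 1.0 - model_accuracy
    # Rearranging the assumed definition gives the baseline error rate.
    baseline_error = model_error / (1.0 - error_reduction)
    return 1.0 - baseline_error


# PVC figures quoted in the abstract: 79.4% accuracy, 41.1% error reduction.
pvc_baseline = implied_baseline_accuracy(0.794, 0.411)
print(f"Implied PVC baseline accuracy: {pvc_baseline:.3f}")
```

Under this reading, the PVC baseline would sit at roughly 65% accuracy. The same function cannot be applied to the TE figures quoted as absolute improvements, since those are differences in percentage points rather than relative reductions.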
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois