Unsupervised grammar induction with Combinatory Categorial Grammars
Bisk, Yonatan Yitzhak
Permalink
https://hdl.handle.net/2142/89027
Description
- Title
- Unsupervised grammar induction with Combinatory Categorial Grammars
- Author(s)
- Bisk, Yonatan Yitzhak
- Issue Date
- 2015-11-30
- Director of Research (if dissertation) or Advisor (if thesis)
- Hockenmaier, Julia
- Doctoral Committee Chair(s)
- Hockenmaier, Julia
- Committee Member(s)
- Eisner, Jason
- Roth, Dan
- Zhai, ChengXiang
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Combinatory Categorial Grammar (CCG)
- Grammar Induction
- Unsupervised Methods
- Abstract
- Language is a highly structured medium for communication. An idea starts in the speaker's mind (semantics) and is transformed into a well-formed, intelligible sentence via the specific syntactic rules of a language. We aim to discover the fingerprints of this process in the choice and location of words used in the final utterance. What is unclear is how much of this latent process can be discovered from the linguistic signal alone and how much requires shared non-linguistic context, knowledge, or cues. Unsupervised grammar induction is the task of analyzing strings in a language to discover the latent syntactic structure of the language without access to labeled training data. Successes in unsupervised grammar induction shed light on the amount of syntactic structure that is discoverable from raw or part-of-speech tagged text. In this thesis, we present a state-of-the-art grammar induction system based on Combinatory Categorial Grammars. Our choice of syntactic formalism enables the first labeled evaluation of an unsupervised system. This allows us to perform an in-depth analysis of the system’s linguistic strengths and weaknesses. In order to completely eliminate reliance on any supervised systems, we also examine how performance is affected when we use induced word clusters instead of gold-standard POS tags. Finally, we perform a semantic evaluation of induced grammars, providing unique insights into future directions for unsupervised grammar induction systems.
- Graduation Semester
- 2015-12
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/89027
- Copyright and License Information
- Copyright 2015 Yonatan Bisk
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science