Creating A Disability Corpus for Literary Analysis: Pilot Classification Experiments
Dubnicek, Ryan; Underwood, Ted; Downie, J. Stephen
Permalink
https://hdl.handle.net/2142/100252
Description
- Title
- Creating A Disability Corpus for Literary Analysis: Pilot Classification Experiments
- Author(s)
- Dubnicek, Ryan
- Underwood, Ted
- Downie, J. Stephen
- Issue Date
- 2018
- Keyword(s)
- distant reading
- digital humanities
- HathiTrust
- disability in literature
- Abstract
- As literary text opens to researchers for distant reading, the computational analysis of large corpora of text for literary scholarship, problems have emerged beyond typical data science roadblocks such as data scale and the statistical significance of findings. For scholars studying character and social representation in literature, identifying the characters that belong to the classes under study is crucial, painstaking, and often manual. For characters with disabilities, manual identification is prohibitively difficult to undertake at scale, and it is especially challenging given the coded textual markers that can be used to refer to disability. No corpus of fictional characters with disabilities currently exists, and building one is the first step toward at-scale computational study of this topic. This project pilots a classification process using manually assigned ground truth on a subset of volumes from the HathiTrust. Having successfully built and evaluated a Naïve Bayes classifier, we suggest full-scale deployment of a statistical classifier on a large corpus of literature in order to assemble a disability corpus. This project also covers preliminary exploratory textual analysis of characters with disabilities to yield potential research questions for further exploration.
- Publisher
- iSchools
- Series/Report Name or Number
- iConference 2018 Proceedings
- Type of Resource
- text
- Language
- eng
- Permalink
- http://hdl.handle.net/2142/100252
- Copyright and License Information
- Copyright 2018 is held by Ryan Dubnicek, Ted Underwood, J. Stephen Downie. Copyright permissions, when appropriate, must be obtained directly from the authors.
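
The abstract describes training a Naïve Bayes classifier on manually assigned ground truth. As a rough illustration only, the sketch below shows how a multinomial Naïve Bayes classifier with add-one smoothing could be fit to labeled text snippets. It is not the authors' implementation: the record does not state their features, tokenization, or label scheme, so the whitespace tokenizer, the label names `"disability"` and `"other"`, the function names, and the toy snippets in `TOY_GROUND_TRUTH` are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy ground truth, invented for illustration only.
# The real project used manually labeled HathiTrust volumes.
TOY_GROUND_TRUTH = [
    ("walked with a crutch since the accident", "disability"),
    ("had been blind from birth", "disability"),
    ("rode quickly across the open field", "other"),
    ("laughed and danced at the ball", "other"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(labeled_texts):
    """Collect the counts needed for multinomial Naive Bayes from (text, label) pairs."""
    doc_counts = Counter()               # number of training snippets per label
    word_counts = defaultdict(Counter)   # per-label token frequencies
    vocabulary = set()
    for text, label in labeled_texts:
        doc_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return doc_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior score for the given text."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: share of training snippets carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        label_total = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # tokens never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (label_total + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(TOY_GROUND_TRUTH)
print(predict(model, "leaned on his crutch"))
print(predict(model, "danced across the field"))
```

On real data, a held-out portion of the ground truth would be set aside to evaluate the classifier before applying it to a larger corpus, in line with the pilot-then-deploy approach the abstract suggests.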