Proposal for Persistent & Unique Entity Identifiers
Jett, Jacob; Ruan, Guangchen; Unnikrishnan, Leena; Fallaw, Colleen; Maden, Christopher; Cole, Timothy
Permalink
https://hdl.handle.net/2142/73147
Description
- Title
- Proposal for Persistent & Unique Entity Identifiers
- Author(s)
- Jett, Jacob
- Ruan, Guangchen
- Unnikrishnan, Leena
- Fallaw, Colleen
- Maden, Christopher
- Cole, Timothy
- Issue Date
- 2014-08-22
- Keyword(s)
- system architecture
- identifiers
- HathiTrust Research Center
- HathiTrust Digital Library
- digital libraries
- Abstract
- This proposal argues for the establishment of persistent and unique identifiers for page-level content. The page is a key conceptual entity within the HathiTrust Research Center (HTRC) framework. Volumes are composed of pages, and the page is the unit of data that the HTRC’s analytics modules consume and execute algorithms across. The need for infrastructure that supports persistent and unique identity for page-level content is best described by seven use cases:
  1. Persistent Citability: Scholars engaging in the analysis of HTRC resources have a clear need to cite those resources in a persistent manner independent of those resources’ relative positions within other entities.
  2. Point-in-time Citability: Scholars engaging in the analysis of HTRC resources have a clear need to cite resources in an unambiguous way that is persistent with respect to time.
  3. Reproducibility: Scholars need methods by which the resources that they cite can be shared so that their work conforms to the norms of peer review and reproducibility of results.
  4. Supporting “non-consumptive” Usage: Anonymizing page-level content by disassociating it from the volumes that it is conceptually a part of increases the difficulty of leveraging HTRC analytics modules for the direct reproduction of HathiTrust (HT) content.
  5. Improved Granularity: Since many features that scholars are interested in exist at the conceptual level of a page rather than at the level of a volume, unique page-level entities expand the types of methods by which worksets can be gathered and by which analytics modules can be constructed.
  6. Expanded Workset Membership: In the near future we would like to empower scholars with options for creating worksets from arbitrary resources at arbitrary levels of granularity, including constructing worksets from collections of arbitrary pages.
  7. Supporting Graph Representations: Unique identifiers for page-level content facilitate the creation of more conceptually accurate and functional graph representations of the HT corpus.
  There are several ways
- Publisher
- University of Illinois
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/73147
Owning Collections
Student Publications and Research - Information Sciences PRIMARY
Publications, conference papers, and other research and scholarship of iSchool students.