Workset Creation for Scholarly Analysis and Data Capsules (WCSA+DC): Laying the foundations for secure computation with copyrighted data in the HathiTrust Research Center, Phase I
Downie, J. Stephen; Plale, Beth; McDonald, Robert; Namachchivaya, Beth Sandore; Unsworth, John; Cole, Timothy W.
Permalink
https://hdl.handle.net/2142/99010
Description
Title
Workset Creation for Scholarly Analysis and Data Capsules (WCSA+DC): Laying the foundations for secure computation with copyrighted data in the HathiTrust Research Center, Phase I
Author(s)
Downie, J. Stephen
Plale, Beth
McDonald, Robert
Namachchivaya, Beth Sandore
Unsworth, John
Cole, Timothy W.
Contributor(s)
Dubnicek, Ryan
Ma, Yu "Marie"
Underwood, Ted
Pustejovsky, James
Verhagen, Marc
Hinze, Annika
Page, Kevin
Green, Harriett
Issue Date
2015-12-16
Keyword(s)
HTRC
HathiTrust
worksets
data capsules
text data mining
digital humanities
non-consumptive research
secure computing
computational linguistics
Abstract
The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access to HathiTrust’s massive corpus of data objects, securely and at scale, regardless of copyright status. That is, we plan to surmount the copyright wall on behalf of scholars and their students.
Notwithstanding the substantial preliminary work that has been done on both the WCSA and DC fronts, both are still best characterized as being in the prototyping stages. It is our intention that this proposed Phase I of the project devote an intense two-year burst of effort to move the suite of WCSA and DC prototypes from the realm of proof-of-concept to that of a firmly integrated at-scale deployment. We plan to concentrate our requested resources on making sure our systems are as secure and robust at scale as possible.
Phase I will engage four external research partners. Two of the external partners, Kevin Page (Oxford) and Annika Hinze (Waikato), were recipients of WCSA prototyping sub-awards. We are very glad to propose extending and refining aspects of their prototyping work in the context of WCSA+DC. Two other scholars, Ted Underwood (Illinois) and James Pustejovsky (Brandeis), will play critical roles in Phase I as active participants in the development and refinement of the tools and systems from their particular user-scholar perspectives: Underwood, Digital Humanities (DH); Pustejovsky, Computational Linguistics (CL).
The four key outcomes and benefits of the WCSA+DC, Phase I project are:
1. The deployment of a new Workset Builder tool that enhances search and discovery across the entire HathiTrust Digital Library (HTDL) by complementing traditional volume-level bibliographic metadata with new metadata derived from a variety of sources at various levels of granularity.
2. The creation of Linked Open Data resources to help scholars find, select, integrate and disseminate a wider range of data as part of their scholarly analysis life-cycle.
3. A new Data Capsule framework that integrates worksets, runs at scale, and does both in a secure, non-consumptive manner.
4. A set of exemplar pre-built Data Capsules that incorporate tools commonly used by both the DH and CL communities that scholars can then customize to their specific needs.