Text Mining Scholarly Publications Using APIs
Sarraf, Ishita
Permalink
https://hdl.handle.net/2142/120049
Description
- Title
- Text Mining Scholarly Publications Using APIs
- Author(s)
- Sarraf, Ishita
- Contributor(s)
- Schneider, Jodi
- Fu, Yuanxi
- Issue Date
- 2023-07-26
- Keyword(s)
- text mining, scholarly publications, full text, requirements analysis, scholarly data mining, XML
- Abstract
- Text mining is a tool that researchers use to analyze their own custom datasets of scholarly publications. However, digital publications come with copyright licenses, and many are not open access, which creates obstacles for researchers, such as difficulty downloading the full text of the papers. In this talk, I describe my work constructing a pipeline that will download full texts of scholarly publications to help researchers create their own custom datasets. My pipeline will be reusable: given the Digital Object Identifier (DOI) of any scholarly paper, it can extract the paper's PDF and XML full texts, if available, and store them in a database. To extract the full text under various copyright licenses, I will use text and data mining APIs supplied by Crossref, Elsevier, and Wiley. I will also identify scientific analysis tasks that can be done after the extraction by interviewing members of my lab as part of my requirements analysis. The full text extraction pipeline is important because it allows different datasets of scholarly publications to be created using a single pipeline, making it easier for researchers to construct their custom datasets without spending time working through copyright licenses.
- Has Part
- https://github.com/infoqualitylab/text-mining-scholarly-API
- Type of Resource
- Presentation
- Language
- en
- Sponsor(s)/Grant Number(s)
- NSF 2046454
- NSF BPC-A #1246649
- Copyright and License Information
- Ishita Sarraf
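Illustrative sketch (not part of the deposited presentation): the abstract describes a pipeline that takes a DOI, finds the PDF and XML full texts if available, and stores them in a database. The Python below is a minimal sketch of that idea under stated assumptions, not the author's implementation. It assumes a work record shaped like the `link` field of Crossref REST API metadata; the function names, the table layout, and the sample record are hypothetical. The download itself is left out because it depends on publisher-specific credentials.

```python
import sqlite3

# Hypothetical helper: pick out text-mining links from a work record that is
# assumed to follow the "link" field layout of Crossref REST API metadata.
def fulltext_links(work):
    """Return a dict such as {"pdf": url, "xml": url} for text-mining links."""
    wanted = {"application/pdf": "pdf", "application/xml": "xml", "text/xml": "xml"}
    found = {}
    for link in work.get("link", []):
        if link.get("intended-application") != "text-mining":
            continue  # skip links meant for other uses
        kind = wanted.get(link.get("content-type"))
        if kind and kind not in found:
            found[kind] = link["URL"]  # keep the first link of each format
    return found

# Hypothetical storage step: one row per (DOI, format) in a SQLite table.
def store_links(conn, doi, links):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fulltext ("
        "doi TEXT, format TEXT, url TEXT, PRIMARY KEY (doi, format))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fulltext VALUES (?, ?, ?)",
        [(doi, fmt, url) for fmt, url in links.items()],
    )
    conn.commit()

# Made-up sample record for illustration only; not real metadata.
sample_work = {
    "DOI": "10.1234/example",
    "link": [
        {"URL": "https://publisher.example/10.1234/example.pdf",
         "content-type": "application/pdf",
         "intended-application": "text-mining"},
        {"URL": "https://publisher.example/10.1234/example.xml",
         "content-type": "application/xml",
         "intended-application": "text-mining"},
        {"URL": "https://publisher.example/10.1234/example",
         "content-type": "unspecified",
         "intended-application": "similarity-checking"},
    ],
}

conn = sqlite3.connect(":memory:")
links = fulltext_links(sample_work)
store_links(conn, sample_work["DOI"], links)
rows = conn.execute("SELECT format, url FROM fulltext ORDER BY format").fetchall()
```

Keying the table on the DOI and format means a rerun for the same DOI overwrites earlier rows instead of duplicating them, which is one way a single pipeline could be reused across datasets.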
Owning Collections
Student Publications and Research - Information Sciences PRIMARY
Publications, conference papers, and other research and scholarship of iSchool students.