Augmenting optical character recognition (OCR) for improved digitization: Strategies to access scientific data in natural history collections
Paul, Deborah L.; Heidorn, P. Bryan
Permalink
https://hdl.handle.net/2142/39427
Description
- Title
- Augmenting optical character recognition (OCR) for improved digitization: Strategies to access scientific data in natural history collections
- Author(s)
- Paul, Deborah L.
- Heidorn, P. Bryan
- Issue Date
- 2013-02
- Keyword(s)
- iDigBio
- OCR
- natural language
- information analysis
- machine language
- information organization
- information services
- research methods
- information retrieval
- qualitative data analysis
- Abstract
- The Augmenting OCR Working Group (A-OCR WG) at Integrated Digitized Biocollections (iDigBio) seeks to improve community OCR strategies and algorithms for faster, better parsing of OCR output derived from valuable data on natural history collection specimen labels. This task is exceedingly difficult because museum labels are often annotated and vary in content, form, and font. Under the National Science Foundation's (NSF) Advancing Digitization of Biological Collections (ADBC) program, iDigBio is building a cyberinfrastructure to aggregate quality data from museum specimens housed in collections across the United States for use by researchers, educators, environmentalists, and the public. Since March 2012, the A-OCR WG, formed by community consensus, has taken up its role in this endeavor, defining reachable goals that include setting up a hackathon concurrent with iConference 2013. This paper reports on the definition of some key problems identified by the A-OCR WG, because these science problems will drive research and cyberinfrastructure development.
- Publisher
- iSchools
- Type of Resource
- text
- Language
- en
- Permalink
- http://hdl.handle.net/2142/39427
- DOI
- https://doi.org/10.9776/13266
- Copyright and License Information
- Copyright © 2013 is held by the authors. Copyright permissions, when appropriate, must be obtained directly from the authors.
Owning Collections
iConference 2013 notes PRIMARY