Name Matters: Taxonomic Name Recognition (TNR) in Biodiversity Heritage Library (BHL)
Wei, Qin; Heidorn, P. Bryan; Freeland, Chris
Permalink
https://hdl.handle.net/2142/14919
Description
Title
Name Matters: Taxonomic Name Recognition (TNR) in Biodiversity Heritage Library (BHL)
Author(s)
Wei, Qin
Heidorn, P. Bryan
Freeland, Chris
Issue Date
2010-02-03
Keyword(s)
Taxonomic Name Recognition
TNR
biodiversity informatics
Machine Learning
Digital Libraries
Information Retrieval
Abstract
Taxonomic Name Recognition (TNR) is a prerequisite for more advanced
processing and mining of full-text taxonomic literature.
This paper investigates three issues concerning current TNR
tools in detail: (1) the difficulties of TNR and the methods used
to address them; (2) the performance of Optical Character Recognition
(OCR) and TNR tools on samples from the Biodiversity
Heritage Library (BHL); and (3) methods for potential improvement.
We found that the performance of current TNR
techniques needs to be improved. A detailed error analysis
reveals that sublanguage characteristics account for much of
the error. A preliminary experiment using Naive Bayes (NB)
models shows the potential of using machine learning (ML)
in TNR.
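To make the Naive Bayes idea concrete, the following is a minimal sketch of a token-level classifier that labels a word as part of a taxonomic name or not. It is not the model, feature set, or data used in the paper: the three surface features, the suffix list, the `NaiveBayes` class, and the sixteen training tokens are all hypothetical choices made for illustration only.

```python
import math
from collections import Counter, defaultdict

# Illustrative Latin-looking endings; an assumption, not a list from the paper.
LATIN_SUFFIXES = ("us", "um", "ae", "is", "ii", "a")


def features(token):
    """Map a token to a few boolean surface features (hypothetical choice)."""
    return {
        "capitalized": token[:1].isupper(),
        "latin_suffix": token.lower().endswith(LATIN_SUFFIXES),
        "long": len(token) >= 6,
    }


class NaiveBayes:
    """Naive Bayes over boolean features with Laplace (add-one) smoothing."""

    def fit(self, tokens, labels):
        self.label_counts = Counter(labels)
        # (label, feature name) -> counts of the values True / False
        self.feature_counts = defaultdict(Counter)
        for token, label in zip(tokens, labels):
            for name, value in features(token).items():
                self.feature_counts[(label, name)][value] += 1
        return self

    def log_scores(self, token):
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in features(token).items():
                seen = self.feature_counts[(label, name)][value]
                # Each feature has two possible values, hence "+ 2".
                score += math.log((seen + 1) / (count + 2))
            scores[label] = score
        return scores

    def predict(self, token):
        scores = self.log_scores(token)
        return max(scores, key=scores.get)


# Made-up toy training data, not drawn from BHL.
TOKENS = ["Quercus", "alba", "Homo", "sapiens", "Rosa", "canina", "Felis", "catus",
          "the", "leaves", "are", "found", "in", "Forest", "species", "with"]
LABELS = ["name"] * 8 + ["other"] * 8

model = NaiveBayes().fit(TOKENS, LABELS)

if __name__ == "__main__":
    for word in ("Panthera", "and"):
        print(word, model.predict(word))
```

A classifier this small will misjudge many real tokens, especially OCR-damaged ones; a practical system would need far more training data and richer features, such as neighbouring tokens.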