Human-assisted OCR of Japanese books with different kinds of microtasks
Author(s)
Ikeda, Kosetsu
Hayashi, Ryota
Nagasaki, Kiyonori
Morishima, Atsuyuki
Issue Date
2017
Keyword(s)
Digital transcription
Crowdsourcing
Microtasks
Abstract
Human-assisted OCR is a common approach for transcribing books and
has been used for many digital library projects.
This paper reports on our project to transcribe the book collections of the National Diet Library using this approach.
Our project is unique in two ways. First,
we try to extend the human-assisted OCR approach by distributing microtasks in many ways other than just showing tasks on a specific Web page on PC screens.
Second, we deal with Japanese books, which use thousands of distinct characters,
some of which look similar to each other.
This paper shows that we can expect high-quality results even when we transcribe Japanese texts with microtasks,
and that we can expect the number of performed microtasks to be stable if we distribute microtasks to equipment with which workers perform microtasks in their daily lives.