University of Illinois Urbana-Champaign

The Gutenberg-HathiTrust Parallel Corpus: A Real-World Dataset for Noise Investigation in Uncorrected OCR Texts

Jiang, Ming; Hu, Yuerong; Worthey, Glen; Dubnicek, Ryan C.; Capitanu, Boris; Kudeki, Deren; Downie, J. Stephen
