Skip to main content

Research Repository

Advanced Search

Outputs (18)

Efficient and effective OCR engine training (2019)
Journal Article
Clausner, C., Antonacopoulos, A., & Pletschacher, S. (2020). Efficient and effective OCR engine training. International Journal on Document Analysis and Recognition, 23(1), 73-78. https://doi.org/10.1007/s10032-019-00347-8

We present an efficient and effective approach to train OCR engines using the Aletheia document analysis system. All components required for training are seamlessly integrated into Aletheia: training data preparation, the OCR engine’s training proces... Read More about Efficient and effective OCR engine training.

Creating a complete workflow for digitising historical census documents : considerations and evaluation (2017)
Presentation / Conference Contribution

The 1961 Census of England and Wales was the first UK census to make use of computers. However, only bound volumes and microfilm copies of printouts remain, locking a wealth of information in a form that is practically unusable for research. In this... Read More about Creating a complete workflow for digitising historical census documents : considerations and evaluation.

Unearthing the recent past : digitising and understanding statistical information from census tables (2017)
Presentation / Conference Contribution

Censuses comprise a wealth of information at a large (national) scale that allow governments (who commission them) and the public to have a detailed snapshot of how people live (geographical distribution and characteristics). In addition to underpinn... Read More about Unearthing the recent past : digitising and understanding statistical information from census tables.

Effective geometric restoration of distorted historical documents for large-scale digitization (2017)
Journal Article
Yang, P., Antonacopoulos, A., Clausner, C., Pletschacher, S., & Qi, J. (2017). Effective geometric restoration of distorted historical documents for large-scale digitization. IET Image Processing, 11(10), 841-853. https://doi.org/10.1049/iet-ipr.2016.0973

Due to storage conditions and material’s non-planar shape, geometric distortion of the 2-D content is widely present in scanned document images. Effective geometric restoration of these distorted document images considerably increases character recog... Read More about Effective geometric restoration of distorted historical documents for large-scale digitization.