Mr Christian Clausner C.Clausner@salford.ac.uk
Senior Research Fellow
J Hayes
Prof Apostolos Antonacopoulos A.Antonacopoulos@salford.ac.uk
Professor
Mr Stefan Pletschacher S.Pletschacher@salford.ac.uk
Lecturer
The 1961 Census of England and Wales was the first UK census to make use of computers. However, only bound volumes and microfilm copies of printouts remain, locking a wealth of information in a form that is practically unusable for research. In this paper, we describe the process of creating the digitisation workflow that was developed as part of a pilot study for the Office for National Statistics. The emphasis of the paper is on the issues originating from the historical nature of the material and how they were resolved. The steps described include image pre-processing, OCR setup, table recognition, post-processing, data ingestion, crowdsourcing, and quality assurance. Evaluation methods and results are presented for all steps.
Clausner, C., Hayes, J., Antonacopoulos, A., & Pletschacher, S. (2017). Creating a complete workflow for digitising historical census documents: considerations and evaluation. https://doi.org/10.1145/3151509.3151525
Conference Name | 2017 Workshop on Historical Document Imaging and Processing (HIP2017) |
---|---|
Conference Location | Kyoto, Japan |
Start Date | Nov 10, 2017 |
End Date | Nov 11, 2017 |
Publication Date | Nov 11, 2017 |
Deposit Date | Nov 20, 2017 |
Publicly Available Date | Nov 21, 2017 |
ISBN | 9781450353908 |
DOI | https://doi.org/10.1145/3151509.3151525 |
Related Public URLs | http://events.unifr.ch/hip2017/ |
HIP2017 - Census 1961 camera-ready 3.pdf
(1 Mb)
PDF
Version
Author's accepted manuscript