Mr Christian Clausner C.Clausner@salford.ac.uk
Senior Research Fellow
J Hayes
Prof Apostolos Antonacopoulos A.Antonacopoulos@salford.ac.uk
Professor
This paper describes how crowdsourcing can be incorporated as an integral part of a comprehensive technical workflow to identify, extract and validate data from large volumes of printed tabular statistics, and transform them into operable digital datasets using current structural and descriptive standards. The recently completed digitisation project for the 1961 Census of England and Wales (commissioned by the UK's Office for National Statistics) is used to provide details on data processing, the crowdsourcing platform and tasks, crowd interaction, and validation of results. The multi-modal approach employed was very successful: because of the challenging nature of the source material, it delivered far more complete and validated data than automated processes alone could produce.
Clausner, C., Hayes, J., & Antonacopoulos, A. (2019). Crowdsourcing historical tabular data: 1961 census of England and Wales. In Proceedings of the 5th International Workshop on Historical Document Imaging and Processing - HIP '19. https://doi.org/10.1145/3352631.3352643
Conference Name | 5th International Workshop on Historical Document Imaging and Processing - HIP'19 |
---|---|
Conference Location | Sydney, Australia |
Start Date | Sep 20, 2019 |
End Date | Sep 21, 2019 |
Online Publication Date | Sep 20, 2019 |
Publication Date | Sep 20, 2019 |
Deposit Date | Nov 12, 2019 |
Publicly Available Date | Nov 12, 2019 |
Series Title | ACM International Conference Proceeding Series: HIP: Historical Document Imaging and Processing |
Series Number | 02155 |
Book Title | Proceedings of the 5th International Workshop on Historical Document Imaging and Processing - HIP '19 |
ISBN | 9781450376686 |
DOI | https://doi.org/10.1145/3352631.3352643 |
Publisher URL | https://doi.org/10.1145/3352631.3352643 |
Related Public URLs | https://www.primaresearch.org/hip2019/ https://dl.acm.org/citation.cfm?id=3352631&picked=prox |