Dr Christian Clausner C.Clausner@salford.ac.uk
Senior Research Fellow
S Pletschacher
Prof Apostolos Antonacopoulos A.Antonacopoulos@salford.ac.uk
Professor
Reading order detection and representation is an important task in many digitisation scenarios that involve preserving the logical structure of a document. Evaluating the reading order results generated by layout analysis methods poses a particular challenge because the detected segmentation of a page can deviate from the ground truth. To address this, a novel evaluation approach that incorporates region correspondence analysis is proposed. Furthermore, a reading order representation scheme is presented and used by the system; it allows objects to be grouped with ordered and/or unordered relations, a typical requirement for documents with complex layouts such as magazines and newspapers. The evaluation method has been validated using the results of two state-of-the-art OCR / layout analysis systems and a basic top-to-bottom reading order detection algorithm, applied to representative samples from the PRImA contemporary and the IMPACT historical document datasets.
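The idea of grouping objects with ordered and/or unordered relations can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Region`, `Group`, `leaf_ids` and `before_pairs`, and the sample region ids, are hypothetical and chosen only for illustration. The sketch models a reading order as a tree and derives the "read before" pairs it implies.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import List, Set, Tuple, Union


@dataclass
class Region:
    """Leaf node: one layout region, identified by a hypothetical id."""
    region_id: str


@dataclass
class Group:
    """Inner node whose children are either ordered or unordered."""
    ordered: bool
    children: List[Union["Group", Region]] = field(default_factory=list)


Node = Union[Group, Region]


def leaf_ids(node: Node) -> List[str]:
    """Return the ids of all regions below (or at) this node."""
    if isinstance(node, Region):
        return [node.region_id]
    ids: List[str] = []
    for child in node.children:
        ids.extend(leaf_ids(child))
    return ids


def before_pairs(node: Node) -> Set[Tuple[str, str]]:
    """Collect (a, b) pairs meaning 'region a is read before region b'."""
    if isinstance(node, Region):
        return set()
    pairs: Set[Tuple[str, str]] = set()
    for child in node.children:
        pairs |= before_pairs(child)
    if node.ordered:
        # Only ordered groups impose a sequence between their children.
        for earlier, later in combinations(node.children, 2):
            for a in leaf_ids(earlier):
                for b in leaf_ids(later):
                    pairs.add((a, b))
    return pairs


# Assumed example page: a heading, then an article (two ordered paragraphs)
# and an advert in no particular order, then a footer.
page = Group(ordered=True, children=[
    Region("heading"),
    Group(ordered=False, children=[
        Group(ordered=True, children=[Region("a1"), Region("a2")]),
        Region("advert"),
    ]),
    Region("footer"),
])

pairs = before_pairs(page)
```

In this sketch the article paragraphs `a1` and `a2` are ordered relative to each other, while neither is ordered relative to `advert`. An evaluation along the lines described above would additionally need to map detected regions to ground-truth regions before comparing such relations; that step is not shown here.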
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 12th International Conference on Document Analysis and Recognition |
Start Date | Aug 25, 2013 |
End Date | Aug 28, 2013 |
Online Publication Date | Oct 15, 2013 |
Publication Date | Aug 1, 2013 |
Deposit Date | Sep 23, 2013 |
Book Title | Proceedings of the 2013 12th International Conference on Document Analysis and Recognition |
ISBN | 9780769549996 |
DOI | https://doi.org/10.1109/ICDAR.2013.141 |
Publisher URL | https://doi.org/10.1109/ICDAR.2013.141 |
Related Public URLs | http://www.primaresearch.org/papers/ICDAR2013_Clausner_ReadingOrder.pdf http://www.computer.org/portal/web/guest/home |