The PAGE (Page Analysis and Ground-Truth Elements) format framework
Pletschacher, S; Antonacopoulos, A
Authors
Mr Stefan Pletschacher S.Pletschacher@salford.ac.uk
Lecturer
Prof Apostolos Antonacopoulos A.Antonacopoulos@salford.ac.uk
Professor
Abstract
There is a plethora of established and proposed document representation formats, but none can adequately support the individual stages within an entire sequence of document image analysis methods (from document image enhancement to layout analysis to OCR) and their evaluation. This paper describes PAGE, a new XML-based page image representation framework that records information on image characteristics (image borders, geometric distortions and corresponding corrections, binarisation, etc.) in addition to layout structure and page content. The suitability of the framework for the evaluation of entire workflows as well as individual stages has been extensively validated through its use in high-profile applications, such as public contemporary and historical ground-truthed datasets and the ICDAR Page Segmentation competition series.
Citation
Pletschacher, S., & Antonacopoulos, A. (2010). The PAGE (Page Analysis and Ground-Truth Elements) format framework. In 2010 20th International Conference on Pattern Recognition. https://doi.org/10.1109/ICPR.2010.72
Field | Value |
---|---|
Conference Name | 20th International Conference on Pattern Recognition (ICPR2010) |
Conference Location | Istanbul, Turkey |
Start Date | Aug 23, 2010 |
End Date | Aug 26, 2010 |
Online Publication Date | Oct 7, 2010 |
Publication Date | Aug 26, 2010 |
Deposit Date | Oct 7, 2011 |
Publicly Available Date | Apr 5, 2016 |
Book Title | 2010 20th International Conference on Pattern Recognition |
ISBN | 9781424475421 |
DOI | https://doi.org/10.1109/ICPR.2010.72 |
Publisher URL | http://dx.doi.org/10.1109/ICPR.2010.72 |
Related Public URLs | https://ieeexplore.ieee.org/xpl/conhome/5595335/proceeding |
Files
PAGE.pdf (183 KB, PDF)