Research Repository

All Outputs (24)

Document representation refinement for precise region description (2014)
Conference Proceeding
Clausner, C., Pletschacher, S., & Antonacopoulos, A. (2014). Document representation refinement for precise region description. In A. Antonacopoulos, & K. Schulz (Eds.), DATeCH '14: Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage. https://doi.org/10.1145/2595188.2595198

Precise description of layout entities (content regions on a page) is crucial for all but the most trivial document analysis and recognition applications. The output of layout analysis methods and state-of-the-art OCR systems varies significantly, fr...

The significance of reading order in document recognition and its evaluation (2013)
Conference Proceeding
Clausner, C., Pletschacher, S., & Antonacopoulos, A. (2013). The significance of reading order in document recognition and its evaluation. In Proceedings of the 2013 12th International Conference on Document Analysis and Recognition. https://doi.org/10.1109/ICDAR.2013.141

Reading order detection and representation is an important task in many digitisation scenarios involving the preservation of the logical structure of a document. The corresponding need for the evaluation of reading order results generated by layout a...

Aletheia - An advanced document layout and text ground-truthing system for production environments (2011)
Conference Proceeding
Clausner, C., Pletschacher, S., & Antonacopoulos, A. (2011). Aletheia - An advanced document layout and text ground-truthing system for production environments. In 2011 International Conference on Document Analysis and Recognition ICDAR 2011. https://doi.org/10.1109/ICDAR.2011.19

Large-scale digitisation has led to a number of new possibilities with regard to adaptive and learning based methods in the field of Document Image Analysis and OCR. For ground truth production of large corpora, however, there is still a gap in terms...

The ENP image and ground truth dataset of historical newspapers (2015)
Book Chapter
Clausner, C., Papadopoulos, C., Pletschacher, S., & Antonacopoulos, A. (2015). The ENP image and ground truth dataset of historical newspapers. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR) (931-935). IEEE-CPS. https://doi.org/10.1109/ICDAR.2015.7333898

This paper presents a research dataset of historical newspapers comprising over 500 page images, uniquely representative of European cultural heritage from the digitization projects of 12 national and major European libraries, created within the scop...