Distinction between handwritten and machine-printed text based on the bag of visual words model
Zagoris, K; Pratikakis, I; Antonacopoulos, A; Gatos, B; Papamarkos, N
Authors
K Zagoris
I Pratikakis
Prof Apostolos Antonacopoulos, Professor (A.Antonacopoulos@salford.ac.uk)
B Gatos
N Papamarkos
Abstract
In a variety of documents, ranging from forms to archive documents and books with annotations, machine-printed and handwritten text may coexist in the same document image, raising significant issues within the recognition pipeline. It is, therefore, necessary to separate the two types of text so that different recognition methodologies can be applied to each modality. In this paper, a new approach is proposed for identifying and separating handwritten from machine-printed text using the Bag of Visual Words (BoVW) model. Initially, blocks of interest are detected in the document image. For each block, a descriptor is calculated based on the BoVW. The final characterization of the blocks as Handwritten, Machine Printed or Noise is made by a decision scheme which relies upon the combination of binary SVM classifiers. The promising performance of the proposed approach is shown using a consistent evaluation methodology which couples meaningful measures with new datasets dedicated to the problem under consideration.
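The two steps named in the abstract, a BoVW descriptor per block and a decision built from binary classifiers, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy two-dimensional features, the three-word codebook, hard nearest-word assignment, L1 normalisation, and the one-vs-one majority vote. The function names `bovw_descriptor` and `combine_binary_decisions` are hypothetical, and the pairwise scores stand in for the outputs of trained binary SVMs.

```python
from math import dist


def bovw_descriptor(local_features, codebook):
    """Return an L1-normalised histogram of nearest-codeword assignments.

    Assumption: hard assignment with Euclidean distance; the paper may use
    a different assignment or normalisation scheme.
    """
    counts = [0] * len(codebook)
    for feature in local_features:
        # Each local feature votes for its closest visual word.
        nearest = min(range(len(codebook)),
                      key=lambda k: dist(feature, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


def combine_binary_decisions(pairwise_scores):
    """Pick a label by majority vote over binary classifier outputs.

    pairwise_scores maps (class_a, class_b) to a signed score, where a
    positive score favours class_a. One-vs-one voting is an assumption;
    the abstract only says binary SVM classifiers are combined.
    """
    votes = {}
    for (class_a, class_b), score in pairwise_scores.items():
        winner = class_a if score > 0 else class_b
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)


# Toy data: a 3-word codebook and four local features from one block.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
features = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
descriptor = bovw_descriptor(features, codebook)

# Hypothetical signed scores standing in for trained binary SVM outputs.
scores = {
    ("Handwritten", "Machine Printed"): 0.7,
    ("Handwritten", "Noise"): 1.2,
    ("Machine Printed", "Noise"): 0.4,
}
label = combine_binary_decisions(scores)
print(descriptor, label)
```

In a real pipeline the codebook would be learned from training features (for example by clustering) and the scores would come from classifiers trained on block descriptors; both are outside the scope of this sketch.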
Citation
Zagoris, K., Pratikakis, I., Antonacopoulos, A., Gatos, B., & Papamarkos, N. (2014). Distinction between handwritten and machine-printed text based on the bag of visual words model. Pattern Recognition, 47(3), 1051-1062. https://doi.org/10.1016/j.patcog.2013.09.005
Journal Article Type | Article |
---|---|
Publication Date | Jan 1, 2014 |
Deposit Date | May 7, 2014 |
Journal | Pattern Recognition |
Print ISSN | 0031-3203 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 47 |
Issue | 3 |
Pages | 1051-1062 |
DOI | https://doi.org/10.1016/j.patcog.2013.09.005 |
Keywords | Bag of visual words; Local features; Support vector machines; Page layout |
Publisher URL | http://dx.doi.org/10.1016/j.patcog.2013.09.005 |
Related Public URLs | http://www.elsevier.com/wps/product/cws_home/328/description |