Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

Distinction between handwritten and machine-printed text based on the bag of visual words model


Abstract

In a variety of documents, ranging from forms to archive documents and annotated books, machine-printed and handwritten text may coexist in the same document image, which raises significant issues within the recognition pipeline. It is therefore necessary to separate the two types of text so that a different recognition methodology can be applied to each modality. In this paper, a new approach is proposed for identifying and separating handwritten from machine-printed text using the Bag of Visual Words (BoVW) model. Initially, blocks of interest are detected in the document image. For each block, a descriptor is calculated based on the BoVW model. The final characterization of each block as Handwritten, Machine Printed or Noise is made by a decision scheme that relies on a combination of binary SVM classifiers. The promising performance of the proposed approach is shown using a consistent evaluation methodology that couples meaningful measures with new datasets dedicated to the problem under consideration.
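The two steps named in the abstract, a BoVW descriptor per block followed by a combination of binary classifiers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy two-word codebook, the 2-D local descriptors, the one-vs-one majority vote, and the threshold rules that stand in for trained binary SVMs. The abstract does not specify how the paper builds its codebook or combines its classifiers.

```python
from collections import Counter
from math import dist


def bovw_histogram(local_descriptors, codebook):
    """Encode one block as an L1-normalized histogram of nearest codewords."""
    counts = [0] * len(codebook)
    for d in local_descriptors:
        # Hard assignment: each local descriptor votes for its closest visual word.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def classify_block(histogram, binary_classifiers):
    """Majority vote over pairwise binary classifiers (one-vs-one is an assumption)."""
    votes = Counter(clf(histogram) for clf in binary_classifiers)
    # Ties are broken by whichever label was voted for first.
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained binary SVMs; the thresholds are arbitrary.
def hw_vs_mp(h):
    return "Handwritten" if h[1] > 0.5 else "Machine Printed"


def hw_vs_noise(h):
    return "Handwritten" if h[1] > 0.2 else "Noise"


def mp_vs_noise(h):
    return "Machine Printed" if h[0] > 0.5 else "Noise"


# Toy data: a two-word codebook and four local descriptors from one block.
codebook = [(0.0, 0.0), (1.0, 1.0)]
block_descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (1.0, 1.0)]

histogram = bovw_histogram(block_descriptors, codebook)
label = classify_block(histogram, [hw_vs_mp, hw_vs_noise, mp_vs_noise])
print(histogram, label)
```

In a realistic setting the codebook would typically be learned by clustering local features from training images, and each stand-in rule would be replaced by a classifier trained on labeled block histograms.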
