Conference: Indian Conference on Vision, Graphics and Image Processing

Text and Non-text Separation in Scanned Color-Official Documents

Abstract

Official documents consist of text and non-text elements such as logos, stamps, and signatures. Separating these elements in a scanned document plays a significant role in document image retrieval, recognition, and verification. This paper presents a novel scheme for separating the text and non-text elements of official documents using part-based features. We exploit the fact that the intensity distributions of text and non-text elements in HSV color space are distinctive, and propose a new approach to computing part-based features from the S and V channels. Text and non-text components are classified using a majority voting scheme and K-approximate nearest neighbors. The knowledge base acquired during training is indexed with a kD-tree. The method is subsequently extended to detect logos, stamps, and signatures. Experimental results show the effectiveness of the proposed approach.
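The pipeline described in the abstract (S/V features per part, nearest-neighbor lookup in a trained knowledge base, majority vote per component) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the feature is a plain S and V histogram, the neighbor search is exact brute force rather than a kD-tree with approximate search, and the names `sv_histogram`, `knn_labels`, `classify_component`, and the toy `knowledge_base` are hypothetical.

```python
import colorsys
from collections import Counter


def sv_histogram(pixels, bins=4):
    """Concatenated, normalized S and V histograms for (r, g, b) pixels in 0..255.

    Assumption: a stand-in feature, not the paper's part-based descriptor.
    """
    hist = [0.0] * (2 * bins)
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(s * bins), bins - 1)] += 1          # saturation bin
        hist[bins + min(int(v * bins), bins - 1)] += 1   # value bin
    n = max(len(pixels), 1)
    return [h / n for h in hist]


def knn_labels(query, knowledge_base, k=3):
    """Labels of the k stored features closest to `query`.

    Assumption: exact brute-force search (squared Euclidean distance),
    used here in place of an approximate kD-tree lookup.
    """
    ranked = sorted(
        knowledge_base,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(query, item[0])),
    )
    return [label for _, label in ranked[:k]]


def classify_component(parts, knowledge_base, k=3):
    """Each part's k neighbors cast votes; the majority label wins."""
    votes = Counter()
    for part in parts:
        votes.update(knn_labels(sv_histogram(part), knowledge_base, k))
    return votes.most_common(1)[0][0]


# Toy knowledge base: dark, unsaturated parts labeled as text and
# bright, saturated parts labeled as non-text (hypothetical data).
knowledge_base = [
    (sv_histogram([(20, 20, 20)] * 16), "text"),
    (sv_histogram([(40, 40, 40)] * 16), "text"),
    (sv_histogram([(30, 60, 200)] * 16), "non-text"),
    (sv_histogram([(220, 30, 30)] * 16), "non-text"),
]

text_like = [[(10, 10, 10)] * 16, [(25, 25, 25)] * 16]
stamp_like = [[(200, 40, 40)] * 16]
print(classify_component(text_like, knowledge_base))
print(classify_component(stamp_like, knowledge_base))
```

For a knowledge base of realistic size, the brute-force search in `knn_labels` would be the first thing to replace with an indexed lookup, which is the role the abstract assigns to the kD-tree.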