ECCV 2010; European Conference on Computer Vision

State Estimation in a Document Image and Its Application in Text Block Identification and Text Line Extraction


Abstract

This paper proposes a new approach to the estimation of document states such as interline spacing and text line orientation, which facilitates a number of tasks in document image processing. The proposed method can be applied to spatially varying states as well as invariant ones, so that general cases including images with complex layouts, camera-captured images, and handwritten documents can also be handled. Specifically, we find CCs (Connected Components) in a document image and assign a state to each of them. Then the states of CCs are estimated using an energy minimization framework, where the cost function is designed based on frequency domain analysis and minimized via graph-cuts. Using the estimated states, we also develop a new algorithm that performs text block identification and text line extraction. Roughly speaking, we can segment an image into text blocks by cutting the connections among the CCs that are distant compared to the estimated interline spacing, and we can group the CCs into text lines using a bottom-up grouping along the estimated text line orientation. Experimental results on a variety of document images show that our method is efficient and provides promising results in several document image processing tasks.
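The text block step described in the abstract (cutting connections that are long relative to the estimated interline spacing) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how connections between CCs are built or what threshold is used, so the all-pairs linking, the `factor` parameter, and the function name `group_into_blocks` are assumptions made for illustration only. The per-CC spacing values are taken as given here; in the paper they come from the graph-cut state estimation.

```python
from math import hypot


def group_into_blocks(centroids, spacing, factor=1.5):
    """Group connected components (CCs) into text blocks.

    centroids: list of (x, y) CC centers.
    spacing:   estimated interline spacing for each CC (assumed given).
    factor:    hypothetical multiplier; a link survives only if its length
               is at most factor * the smaller spacing of its two endpoints.
    Returns a sorted list of blocks, each a list of CC indices.
    """
    n = len(centroids)
    parent = list(range(n))  # union-find forest over CC indices

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dist = hypot(centroids[i][0] - centroids[j][0],
                         centroids[i][1] - centroids[j][1])
            # Keep short links; "cut" (ignore) the distant ones.
            if dist <= factor * min(spacing[i], spacing[j]):
                parent[find(i)] = find(j)

    blocks = {}
    for i in range(n):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())


if __name__ == "__main__":
    # Two clusters of CCs that are far apart horizontally.
    centroids = [(0, 0), (10, 0), (0, 12), (200, 0), (210, 0)]
    spacing = [12, 12, 12, 12, 12]
    print(group_into_blocks(centroids, spacing))
```

Because the threshold is local to each pair of CCs, the same routine applies when the spacing varies across the page, which is the case the abstract highlights for complex layouts and camera-captured images.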
