IAPR International Conference on Document Analysis and Recognition

Robust Document Image Dewarping Method Using Text-Lines and Line Segments

Abstract

Conventional text-line based document dewarping methods struggle with complex layouts and with images that contain very few text-lines. When an image has few aligned text-lines, photos, graphics, and/or tables usually occupy a large portion of it instead. Hence, for robust document dewarping, we propose to use line segments in the image in addition to the aligned text-lines. Based on the assumption and observation that many line segments are horizontally or vertically aligned in well-rectified images, we encode this property into the cost function alongside the text-line alignment cost. By minimizing this function, we obtain transformation parameters for camera pose, page curve, etc., which are then used for document rectification. Because line segment directions contain many outliers and text-lines are missed in some cases, the overall algorithm is designed to be iterative. At each step, we remove text components and line segments that are not well aligned, and then minimize the cost function with the updated information. Experimental results show that the proposed method is robust to a variety of page layouts.
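
The iterative scheme described in the abstract (minimize an alignment cost, discard poorly aligned line segments, minimize again) can be illustrated with a deliberately simplified sketch. The abstract does not give the actual cost function or parameterization, so everything here is an assumption made for illustration: the only parameter is a single in-plane skew angle `theta` (standing in for camera pose and page curve), the cost `sin^2(2 * (phi - theta))` is one possible way to penalize segments that are neither horizontal nor vertical, and the names `alignment_cost`, `axis_deviation`, `minimize_theta`, `iterative_estimate` and the 10-degree tolerance are hypothetical. Text-line terms are omitted entirely.

```python
import math


def alignment_cost(theta, angles):
    """Toy cost: zero when every segment is horizontal or vertical
    after the image is rotated by -theta (assumed form, not the paper's)."""
    return sum(math.sin(2.0 * (phi - theta)) ** 2 for phi in angles)


def axis_deviation(phi, theta):
    """Angular distance from (phi - theta) to the nearest multiple of pi/2."""
    d = (phi - theta) % (math.pi / 2)
    return min(d, math.pi / 2 - d)


def minimize_theta(angles, steps=1800):
    """Grid search over theta in [-pi/4, pi/4); the toy cost has period pi/2."""
    best_theta, best_cost = 0.0, float("inf")
    for k in range(steps):
        theta = -math.pi / 4 + (math.pi / 2) * k / steps
        cost = alignment_cost(theta, angles)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta


def iterative_estimate(angles, n_iters=5, tol=math.radians(10)):
    """Alternate between minimizing the cost and dropping segments whose
    direction is more than `tol` away from both axes under the current fit."""
    inliers = list(angles)
    theta = minimize_theta(inliers)
    for _ in range(n_iters):
        kept = [phi for phi in angles if axis_deviation(phi, theta) <= tol]
        if not kept or kept == inliers:
            break  # nothing left to fit, or the inlier set has stabilized
        inliers = kept
        theta = minimize_theta(inliers)
    return theta, inliers


# Segment directions in radians: six segments consistent with a skew of
# about 0.1 rad (three near-horizontal, three near-vertical), plus three
# outliers such as edges inside a photo.
segment_angles = [
    0.10, 0.11, 0.09,
    0.10 + math.pi / 2, 0.105 + math.pi / 2, 0.095 + math.pi / 2,
    0.80, 0.75, -0.40,
]
theta, inliers = iterative_estimate(segment_angles)
print(f"estimated skew: {theta:.3f} rad, inlier segments: {len(inliers)}")
```

In this toy setting the first minimization is pulled slightly off by the outliers; once they are removed, the second minimization fits only the well-aligned segments. A real implementation would optimize many more parameters with a continuous solver rather than a grid search.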