IAPR International Conference on Document Analysis and Recognition

Robust Document Image Dewarping Method Using Text-Lines and Line Segments

Abstract

Conventional text-line based document dewarping methods struggle with complex layouts and with images that contain very few text-lines. An image with few aligned text-lines usually means that photos, graphics and/or tables take up a large portion of the input instead. Hence, for robust document dewarping, we propose to use line segments in the image in addition to the aligned text-lines. Based on the assumption and observation that many line segments are horizontally or vertically aligned in well-rectified images, we encode this property into the cost function alongside the text-line alignment cost. By minimizing this function, we obtain transformation parameters for camera pose, page curve, etc., which are used for document rectification. Because line segment directions contain many outliers and some text-lines may be missed, the overall algorithm is designed in an iterative manner. At each step, we remove text components and line segments that are not well aligned, and then minimize the cost function with the updated information. Experimental results show that the proposed method is robust to a variety of page layouts.
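
The iterative scheme outlined in the abstract (minimize a combined alignment cost, discard poorly aligned elements, minimize again) can be illustrated with a heavily simplified sketch. The paper optimizes camera pose and page-curve parameters; the sketch below instead assumes a single in-plane skew angle `theta`, so it is not the authors' method. The function names, the weight `lam`, the outlier threshold `tol`, and the brute-force grid search are all hypothetical choices made for illustration only.

```python
import math


def cost(theta, text_angles, seg_angles, lam=0.5):
    """Alignment cost for a candidate skew angle theta (radians).

    Assumption: the whole transformation is one rotation by theta,
    standing in for the paper's camera-pose and page-curve parameters.
    """
    # Text-lines should become horizontal after rectification.
    text_term = sum(math.sin(b - theta) ** 2 for b in text_angles)
    # Line segments should become horizontal OR vertical:
    # sin^2(2x) is zero whenever x is a multiple of 90 degrees.
    seg_term = sum(math.sin(2.0 * (a - theta)) ** 2 for a in seg_angles)
    return text_term + lam * seg_term


def minimize_theta(text_angles, seg_angles, lam=0.5, steps=3600):
    """Brute-force 1-D search over theta in [-pi/2, pi/2) (illustrative)."""
    candidates = (-math.pi / 2 + math.pi * k / steps for k in range(steps))
    return min(candidates, key=lambda t: cost(t, text_angles, seg_angles, lam))


def iterative_skew_estimate(text_angles, seg_angles,
                            lam=0.5, tol=0.05, max_iters=10):
    """Alternate between minimizing the cost and removing outliers."""
    text, segs = list(text_angles), list(seg_angles)
    theta = minimize_theta(text, segs, lam)
    for _ in range(max_iters):
        # Keep only elements that are well aligned under the current estimate.
        kept_text = [b for b in text if math.sin(b - theta) ** 2 <= tol]
        kept_segs = [a for a in segs
                     if math.sin(2.0 * (a - theta)) ** 2 <= tol]
        if len(kept_text) == len(text) and len(kept_segs) == len(segs):
            break  # nothing was removed, so the estimate is stable
        text, segs = kept_text, kept_segs
        if not text and not segs:
            break  # no evidence left to fit
        theta = minimize_theta(text, segs, lam)
    return theta, text, segs
```

Here `text_angles` and `seg_angles` are the measured directions (in radians) of detected text-lines and line segments. A real system would replace the scalar `theta` with the full parameter vector and the grid search with a proper nonlinear optimizer.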
