International Journal on Document Analysis and Recognition (IJDAR)

Coupled snakelets for curled text-line segmentation from warped document images


Abstract

Camera-captured, warped document images usually contain curled text-lines because of distortions caused by the camera's perspective view and by page curl. Monocular dewarping techniques can transform warped document images into planar ones, improving both optical character recognition accuracy and human readability. Curled text-line segmentation is a crucial initial step for most monocular dewarping techniques, but existing curled text-line segmentation approaches are sensitive to geometric and perspective distortions. In this paper, we introduce a novel curled text-line segmentation algorithm that adapts active contours (snakes). Our algorithm performs text-line segmentation by estimating pairs of x-lines and baselines. It estimates a local x-line and baseline pair on each connected component by jointly tracing the top and bottom points of neighboring connected components; each group of overlapping pairs is then treated as one segmented text-line. Our algorithm achieves a curled text-line segmentation accuracy above 95% on the DFKI-I (CBDAR 2007 dewarping contest) dataset, which is significantly better than previously reported results on this dataset.
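The final grouping step described in the abstract (each group of overlapping x-line/baseline pairs becomes one text-line) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define how overlap is measured, so the `LinePair` representation (a straight local segment per connected component), the `pairs_overlap` test, and the `min_ratio` threshold are all hypothetical simplifications. The snake-based tracing that would produce the pairs is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinePair:
    """Hypothetical local estimate attached to one connected component."""
    x0: float        # left end of the traced horizontal span
    x1: float        # right end of the traced horizontal span
    x_line: float    # y of the local x-line (image coordinates, y grows downward)
    baseline: float  # y of the local baseline, so baseline > x_line


def pairs_overlap(a: LinePair, b: LinePair, min_ratio: float = 0.5) -> bool:
    """Assumed overlap test: spans intersect and the vertical bands mostly coincide."""
    if min(a.x1, b.x1) < max(a.x0, b.x0):
        return False  # horizontal spans do not intersect
    shared = min(a.baseline, b.baseline) - max(a.x_line, b.x_line)
    smaller = min(a.baseline - a.x_line, b.baseline - b.x_line)
    return smaller > 0 and shared / smaller >= min_ratio


def group_into_text_lines(pairs: list[LinePair], min_ratio: float = 0.5) -> list[list[int]]:
    """Merge transitively overlapping pairs with union-find; returns index groups."""
    parent = list(range(len(pairs)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if pairs_overlap(pairs[i], pairs[j], min_ratio):
                parent[find(i)] = find(j)

    groups: dict[int, list[int]] = {}
    for i in range(len(pairs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three components along one gently curling line, plus one on a lower line.
example = [
    LinePair(0, 50, 10, 20),
    LinePair(40, 90, 12, 22),
    LinePair(80, 130, 15, 25),
    LinePair(0, 60, 40, 50),
]
text_lines = group_into_text_lines(example)
```

Because grouping is transitive, the first and third components end up in the same text-line even though their own spans never touch, which is how a chain of local estimates can follow a curled line across the page in this simplified model.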
