International Journal on Document Analysis and Recognition

Junction-based table detection in camera-captured document images

Abstract

In this paper, we present a method that locates tables and their cells in camera-captured document images. To handle the geometric and photometric distortions in such images, we develop new junction detection and labeling methods: junction detection finds candidates for the corners of cells, and junction labeling infers their connectivity. We treat junctions as intersections of curves, so we first develop a multiple curve detection algorithm. After junction detection, we encode the connectivity information between junctions (including false detections) into 12 labels and design a cost function that reflects pairwise relationships as well as local observations. The cost function is minimized via the belief propagation algorithm, and tables and their cells are located from the inferred labels. To handle multiple tables on a single page, we also propose a table area detection method. It is based on the well-known recursive X-Y cut, which we modify so that it can also deal with curved seams caused by geometric distortions. To evaluate our method, we build a data set that includes a variety of camera-captured table images and make it publicly available. Experimental results on this data set show that our method successfully locates tables and their cells in camera-captured images.
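
For readers unfamiliar with the recursive X-Y cut mentioned above, the following is a minimal sketch of the classical version only, which splits a binarized page along straight, axis-aligned whitespace gaps. It is not the modified method from the paper and does not handle curved seams. The function names, the `min_gap` parameter, and the representation of the image as a list of 0/1 rows are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), half-open
Image = List[List[int]]          # assumed input: 1 = ink, 0 = background


def _profile(img: Image, box: Box, axis: int) -> List[int]:
    """Ink count per row (axis=0) or per column (axis=1) inside box."""
    top, left, bottom, right = box
    if axis == 0:
        return [sum(img[r][left:right]) for r in range(top, bottom)]
    return [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]


def _trim(img: Image, box: Box) -> Optional[Box]:
    """Shrink box to the tight bounding box of its ink; None if it has no ink."""
    top, left, _, _ = box
    rows = [i for i, v in enumerate(_profile(img, box, 0)) if v]
    cols = [i for i, v in enumerate(_profile(img, box, 1)) if v]
    if not rows:
        return None
    return (top + rows[0], left + cols[0], top + rows[-1] + 1, left + cols[-1] + 1)


def _widest_gap(profile: List[int], min_gap: int) -> Optional[Tuple[int, int]]:
    """Longest run of zeros of length >= min_gap as (start, end), else None."""
    best = None
    start = None
    for i, v in enumerate(profile + [1]):  # sentinel closes a trailing run
        if v == 0 and start is None:
            start = i
        elif v != 0 and start is not None:
            if i - start >= min_gap and (best is None or i - start > best[1] - best[0]):
                best = (start, i)
            start = None
    return best


def xy_cut(img: Image, box: Optional[Box] = None, min_gap: int = 2) -> List[Box]:
    """Recursively split box at its widest straight whitespace gap."""
    if box is None:
        box = (0, 0, len(img), len(img[0]) if img else 0)
    trimmed = _trim(img, box)
    if trimmed is None:
        return []
    top, left, bottom, right = trimmed
    row_gap = _widest_gap(_profile(img, trimmed, 0), min_gap)
    col_gap = _widest_gap(_profile(img, trimmed, 1), min_gap)
    if row_gap is None and col_gap is None:
        return [trimmed]  # no gap wide enough: this block is a leaf
    # Prefer the wider gap; ties go to the horizontal cut.
    use_rows = col_gap is None or (
        row_gap is not None and row_gap[1] - row_gap[0] >= col_gap[1] - col_gap[0]
    )
    if use_rows:
        first = (top, left, top + row_gap[0], right)
        second = (top + row_gap[1], left, bottom, right)
    else:
        first = (top, left, bottom, left + col_gap[0])
        second = (top, left + col_gap[1], bottom, right)
    return xy_cut(img, first, min_gap) + xy_cut(img, second, min_gap)
```

Calling `xy_cut(img)` on a binarized page returns the bounding boxes of the blocks that cannot be split further. Because every cut here is a straight line through an all-background row or column, a page with perspective or curl distortion may have no such gap at all, which is the limitation the curved-seam modification in the abstract is meant to address.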