Conference: International Conference on Document Analysis and Recognition
ReS2TIM: Reconstruct Syntactic Structures from Table Images

Abstract

Tables often represent densely packed but structured data. Understanding table semantics is vital for effective information retrieval and data mining. Unlike web tables, whose semantics are readable directly from markup language and contents, the full analysis of tables published as images requires the conversion of discrete data into structured information. This paper presents a novel framework to convert a table image into its syntactic representation through the relationships between its cells. In order to reconstruct the syntactic structures of a table, we build a cell relationship network to predict the neighbors of each cell in four directions. During the training stage, a distance-based sample weight is proposed to handle the class imbalance problem. According to the detected relationships, the table is represented by a weighted graph that is then employed to infer the basic syntactic table structure. Experimental evaluation of the proposed framework using two datasets demonstrates the effectiveness of our model for cell relationship detection and table structure inference.
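As a rough illustration of the last step in the abstract, the sketch below turns predicted neighbor relationships into row and column indices. It is not the authors' inference procedure, which the abstract does not spell out. It assumes each detected relationship is a directed edge with a confidence score, drops low-confidence edges, and assigns each cell the length of the longest path that reaches it. The function name `infer_indices`, the `min_conf` threshold, and the toy 2x3 table are all hypothetical.

```python
from collections import defaultdict, deque


def infer_indices(num_cells, edges, min_conf=0.5):
    """Assign each cell a start index along one axis.

    edges: iterable of (src, dst, conf), read as "dst is the neighbor of src
    in the positive direction" (right for columns, down for rows).
    This is an illustrative longest-path rule, not the paper's method.
    """
    succ = defaultdict(list)
    indeg = [0] * num_cells
    for src, dst, conf in edges:
        if conf >= min_conf:  # keep only confident relationships
            succ[src].append(dst)
            indeg[dst] += 1

    index = [0] * num_cells
    queue = deque(c for c in range(num_cells) if indeg[c] == 0)
    visited = 0
    while queue:  # Kahn-style topological traversal
        cur = queue.popleft()
        visited += 1
        for nxt in succ[cur]:
            # A cell sits at least one step past every predecessor.
            index[nxt] = max(index[nxt], index[cur] + 1)
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    if visited != num_cells:
        raise ValueError("neighbor relationships contain a cycle")
    return index


# Toy 2x3 table: cells 0,1,2 form the first row and 3,4,5 the second.
right_edges = [(0, 1, 0.9), (1, 2, 0.8), (3, 4, 0.95), (4, 5, 0.9),
               (2, 3, 0.1)]  # spurious low-confidence edge, filtered out
down_edges = [(0, 3, 0.9), (1, 4, 0.9), (2, 5, 0.85)]

cols = infer_indices(6, right_edges)
rows = infer_indices(6, down_edges)
```

Running the two axes separately keeps the sketch short; cells that span several rows or columns would need an end index as well as a start index, which is omitted here.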
