Published in: International Conference on Database and Expert Systems Applications

Semantic Matching over Matrix-Style Tables in Richly Formatted Documents


Abstract

Tables are an efficient way to represent a large number of facts compactly. Because practitioners in a vertical domain share a great deal of prior knowledge, they tend to represent facts even more concisely using matrix-style tables. However, such tables are designed for human readers and are not machine-readable, owing to their complex structure: row headers, column headers, metadata, external context, and even hierarchies within headers. To help practitioners mine and use these matrix-style tables more efficiently, this study introduces a challenging task: discovering fact-overlapping relations between matrix-style tables. This relation concerns fine-grained local semantics rather than the overall relatedness targeted by conventional tasks. We propose an attention-based model for this task. Experiments show that our model is better at discovering local relatedness and outperforms four baseline methods. We also conduct an ablation study and a case study to examine the model in detail.
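To make the notion of a fact-overlapping relation concrete, the following is a minimal sketch and not the authors' attention-based model. It assumes each cell of a matrix-style table can be read as one fact made of the external context, a row-header path, a column-header path, and a value, with hierarchical headers written as tuples. The function names, the table layout, and the sample figures are all hypothetical, and exact matching stands in for the semantic matching the abstract describes.

```python
from itertools import product


def flatten_matrix_table(row_headers, col_headers, values, context=()):
    """Read each cell as a fact: (context, row path, column path, value).

    Header hierarchies are given as tuples, e.g. ("Assets", "Current").
    """
    facts = set()
    for (i, row), (j, col) in product(enumerate(row_headers), enumerate(col_headers)):
        facts.add((tuple(context), tuple(row), tuple(col), values[i][j]))
    return facts


def overlapping_facts(table_a, table_b):
    """Return the facts that appear in both tables (exact match only)."""
    return flatten_matrix_table(**table_a) & flatten_matrix_table(**table_b)


# Hypothetical sample data, invented purely for illustration.
table_a = dict(
    context=("Company X",),
    row_headers=[("Revenue",), ("Net profit",)],
    col_headers=[("2019",), ("2020",)],
    values=[[100, 120], [10, 15]],
)
table_b = dict(
    context=("Company X",),
    row_headers=[("Revenue",), ("Total assets",)],
    col_headers=[("2020",), ("2021",)],
    values=[[120, 130], [500, 550]],
)

shared = overlapping_facts(table_a, table_b)
print(shared)
```

The two sample tables differ in most rows and columns, yet they share a single cell-level fact (revenue for 2020). This is the kind of fine-grained local overlap, as opposed to overall table relatedness, that the task targets. A learned model is needed in practice because real headers are rarely identical strings.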
