Journal: Engineering Applications of Artificial Intelligence

Holistic design for deep learning-based discovery of tabular structures in datasheet images


Abstract

Extracting data from tabular structures contained within product datasheets is crucial in many contexts, particularly in the management and optimization of supply chains that serve various industries. To minimize human intervention, table detection and table structure detection are the essential functions. However, a self-contained holistic solution that extracts tables together with their columns and rows is not readily available. To address this challenge, this study presents a new formal procedure consisting of the following sequence: table detection, structure segmentation, and holistic tabular structure detection on documents. The proposed table detection model outperforms state-of-the-art solutions, achieving a recall of 1.0 and a precision of more than 0.99 on public competition datasets. Furthermore, this work introduces a judging mechanism and an agreement-based post-processing procedure to incorporate hand-crafted rules into the deep learning models. Although the individual components achieve new state-of-the-art F1-scores, the best F-measure achieved by the integrated holistic system is 0.89.
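The abstract reports precision, recall, and F-measure without stating how they relate. As a minimal sketch, assuming the standard definition of the F-measure as the harmonic mean of precision and recall (the abstract does not specify the matching criterion or any weighting, so `f_measure` and its `beta` parameter are illustrative names, not the authors' code), the reported table detection figures imply the following lower bound on F1:

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    beta = 1.0 gives the usual F1-score. This is the textbook definition,
    assumed here; the abstract does not say which variant was used.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1.0 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # Reported table detection figures: recall 1.0, precision above 0.99.
    # Using 0.99 therefore yields a lower bound on the implied F1-score.
    lower_bound = f_measure(precision=0.99, recall=1.0)
    print(f"Table detection F1 is at least {lower_bound:.4f}")
```

Because F1 increases with precision when recall is fixed, a precision above 0.99 at recall 1.0 implies an F1 above roughly 0.995 for table detection alone. The gap between that figure and the 0.89 reported for the integrated system illustrates how much is lost once detection, segmentation, and structure recognition are chained together.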
