Scalable Document Image Information Extraction with Application to Domain-Specific Analysis

Abstract

Document images are ubiquitous, but existing methods focus mainly on reading text rather than understanding information. In this paper, we propose a novel document image information extraction framework with application to domain-specific analysis. The key gains of our system result from a modularized implementation of the document analysis modules needed for different document analysis problems. Further, we provide an efficient text recognition approach that trades off performance against running speed for document images, and a novel information extraction method that uses both visual and semantic information. Our framework is scalable and customizable, and only a few annotations of the keyword-content mapping are needed for domain-specific document analysis.

Bibliographic details

Source: IEEE International Conference on Big Data, 2019, pp. 5108-5115 (8 pages)
Authors: Yingbin Zheng; Shuchen Kong; Wanshan Zhu; Hao Ye
Original format: PDF
Keywords: Text recognition; Feature extraction; Proposals; Information retrieval; Text analysis; Semantics; Image segmentation

