Deep Reader: Information Extraction from Document Images via Relation Extraction and Natural Language

Abstract

Recent advances in computer vision driven by state-of-the-art neural networks have boosted Optical Character Recognition (OCR) accuracy. However, extracting characters/text alone is often insufficient for relevant information extraction, as documents also have a visual structure that is not captured by OCR. Extracting information from tables, charts, footnotes, boxes, and headings, and retrieving the corresponding structured representation of the document, remains a challenge and finds application in a large number of real-world use cases. In this paper, we propose a novel enterprise-oriented, end-to-end framework called DeepReader, which facilitates information extraction from document images by identifying visual entities and populating a meta relational model across the different entities in the document image. The model schema provides an easy-to-understand abstraction of the entities detected by the deep vision models and of the relationships between them. DeepReader employs a suite of state-of-the-art vision algorithms to recognize handwritten and printed text, eliminate noisy effects, identify the type of document, and detect visual entities such as tables, lines, and boxes. DeepReader maps the extracted entities into a rich relational schema so as to capture all the relevant relationships between entities (words, text boxes, lines, etc.) detected in the document. Relevant information and fields can then be extracted from the document by writing SQL queries on top of the relationship tables. A natural-language interface is added on top of the relational schema so that a non-technical user, specifying queries in natural language, can fetch the information with minimal effort. In this paper, we also demonstrate many different capabilities of DeepReader and report results on a real-world use case.
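To make the "SQL queries on top of the relationship tables" idea concrete, the sketch below builds a toy version of such a schema in SQLite and extracts a field via a spatial relation. The table names, columns, and the `right_of` relation are illustrative assumptions, not the paper's exact schema.

```python
import sqlite3

# Hypothetical DeepReader-style relationship tables (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words (
    id INTEGER PRIMARY KEY,
    text TEXT,          -- OCR'd token
    line_id INTEGER,    -- visual line the word belongs to
    x0 REAL, y0 REAL    -- top-left coordinate on the page
);
-- Spatial relation: word b lies immediately to the right of word a.
CREATE TABLE right_of (a INTEGER, b INTEGER);
""")

# A toy invoice line: "Invoice No: INV-1042"
conn.executemany("INSERT INTO words VALUES (?,?,?,?,?)", [
    (1, "Invoice", 1, 10, 5),
    (2, "No:", 1, 60, 5),
    (3, "INV-1042", 1, 95, 5),
])
conn.executemany("INSERT INTO right_of VALUES (?,?)", [(1, 2), (2, 3)])

# Field extraction as a query over the relation:
# fetch the word to the right of the label "No:".
row = conn.execute("""
    SELECT w2.text
    FROM words w1
    JOIN right_of r ON r.a = w1.id
    JOIN words w2 ON w2.id = r.b
    WHERE w1.text = 'No:'
""").fetchone()
print(row[0])  # INV-1042
```

A natural-language front end, as described in the abstract, would map a request like "get the invoice number" to a query of this shape, so the end user never writes SQL directly.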
机译:先进的神经网络在计算机视觉领域的最新进展促进了光学字符识别(OCR)的准确性。但是,仅提取字符/文本通常不足以进行相关信息提取,因为文档还具有OCR无法捕获的视觉结构。从表格,图表,脚注,方框,标题中提取信息并检索文档的相应结构化表示仍然是一项挑战,并在大量实际使用案例中找到了应用。在本文中,我们提出了一种新颖的基于企业的端到端框架,称为DeepReader,该框架可通过识别可视实体并在文档图像中的不同实体之间填充元关系模型来促进从文档图像中提取信息。该模型架构允许轻松理解由深度视觉模型检测到的实体及其之间的关系的抽象。 DeepReader拥有一套最先进的视觉算法,可用于识别手写和打印的文本,消除噪声影响,识别文档的类型以及检测诸如表格,线条和盒子之类的视觉实体。深度阅读器将提取的实体映射到丰富的关系模式中,以捕获在文档中检测到的实体(单词,文本框,行等)之间的所有相关关系。然后,可以通过在关系表之上编写SQL查询来从文档中提取相关信息和字段。在关系模式的顶部添加了基于自然语言的界面,以便非技术用户使用自然语言指定查询时,可以以最小的努力获取信息。在本文中,我们还演示了Deep Reader的许多不同功能,并在实际用例中报告结果。
