An Architecture For Multimodal Information Extraction From Scholarly Documents

Abstract

A scholarly paper (journal article, conference proceeding) has both unstructured data (text) and semi-structured data sources (tables and figures). An experimental figure such as a line graph is generated from a data table that stores the results of an experiment. Typically, that data table is not reported in the paper and hence cannot be queried directly. Similarly, a scholarly table reports the results of an experiment but is not structured enough to support anything more than a keyword query.

This dissertation has two contributions. First, we show methods to reduce these semi-structured data sources to structured content that can support factoid queries such as "What is the best precision for the ImageNet classification task?" or "What is the best BLEU score for English to Arabic translation?"

For scholarly figures, we report an end-to-end system. First, we report a batch extractor that extracts all figures (including vector graphics) and associated metadata from a document with 81% and 87% accuracy. Next, we report image processing algorithms that detect compound figures with 82% accuracy and classify non-compound figures as line graphs or bar charts with 84% average accuracy. We improve the accuracy of text extraction from raster graphics by 39% and show algorithms that classify the text inside the plots with an average accuracy of 90%. The majority of figures in computer science papers are embedded as vector graphics. While previous work has always extracted them as raster graphics, we show methods to extract them in a vector graphics format, which allows us to scalably separate curves in line graphs with 75% average accuracy. This reduces a line graph to the original data points from which it was generated, allowing the factoid queries. We report a similar architecture for scholarly tables that can reduce the tables to data-based triples supporting similar queries.

Finally, we show supervised methods to extract scholarly entities from the text of the paper. Specifically, we show that a non-sequential classifier that learns the informativeness of a phrase globally and a sequential classifier that learns the same from the local context can be combined to improve the accuracy of the process.
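
To illustrate why reducing figures and tables to structured records enables factoid queries, the following is a minimal sketch, not the dissertation's implementation. It assumes each extracted result is stored as a (task, metric, value) record with a source identifier. The `Triple` class, the `best_score` function, the source identifiers, and all numeric values are hypothetical placeholders chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Triple:
    """One extracted result (hypothetical schema, assumed for illustration)."""
    task: str     # e.g. "English to Arabic translation"
    metric: str   # e.g. "BLEU"
    value: float  # placeholder number, not a reported result
    source: str   # made-up identifier of the originating table or figure


def best_score(triples: Iterable[Triple], task: str, metric: str,
               higher_is_better: bool = True) -> Optional[Triple]:
    """Answer a factoid query of the form 'best <metric> for <task>'."""
    matches = [t for t in triples if t.task == task and t.metric == metric]
    if not matches:
        return None  # nothing extracted for this task/metric pair
    pick = max if higher_is_better else min
    return pick(matches, key=lambda t: t.value)


# Placeholder records standing in for extractor output.
triples = [
    Triple("English to Arabic translation", "BLEU", 10.0, "paper-A/table-1"),
    Triple("English to Arabic translation", "BLEU", 20.0, "paper-B/figure-3"),
    Triple("ImageNet classification", "precision", 0.5, "paper-C/table-2"),
]

best = best_score(triples, "English to Arabic translation", "BLEU")
print(best)
```

A keyword search over the raw paper could only locate the phrase "BLEU"; the structured records allow filtering by task and metric and then comparing values, which is the kind of query the abstract describes.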

Bibliographic record

Author: Choudhury, Sagnik Ray
Author affiliation: The Pennsylvania State University
Degree-granting institution: The Pennsylvania State University
Subject: Information science; Computer science
Degree: Ph.D.
Year: 2017
Pages: 133
Original format: PDF
Language: English

Similar literature

1. Deep Learning-based Extraction of Algorithmic Metadata in Full-Text Scholarly Documents [J]. Iqra Safder, Saeed-Ul Hassan, Anna Visvizi. Information Processing & Management, 2020, No. 6.
2. Implicit Semantics Based Metadata Extraction and Matching of Scholarly Documents [J]. Jiang Congfeng, Liu Junming, Ou Dongyang. Journal of Database Management, 2018, No. 2.
3. Hash-based document extraction in corporate mobile devices using ontological architectures [J]. Tuncay Ercan. Scientific Research and Essays, 2011, No. 2.
4. Multimodal Alignment of Scholarly Documents and Their Presentations [C]. Bamdad Bahrani, Min-Yen Kan. ACM/IEEE-CS Joint Conference on Digital Libraries, 2013.
5. Semantic Structuring of Scientific Information in Scholarly Documents [D]. Giles, C. Lee. 2017.
6. HL7 document patient record architecture: an XML document architecture based on a shared information model [O]. R. H. Dolin, L. Alschuler, F. Behlen. 1999.
7. Multimodal alignment of scholarly documents and their presentations [O]. Bamdad Bahrani. 2013.
