A Modular Metadata Extraction System for Born-Digital Articles


Abstract

We present a comprehensive system for extracting metadata from scholarly articles. Our approach inspects the entire document, including the headers and footers of every page and the bibliographic references. The system is built on a modular workflow that allows individual components to be evaluated, unit-tested, and replaced. The workflow is optimized for processing born-digital documents, but it can also accept scanned document images. The machine-learning approaches we chose for the individual tasks make the system easier to adapt to new document layouts and formats. In our evaluation tests, both the individual implementations and the entire metadata extraction process achieved good results.
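The modular workflow idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Document`, `Workflow`, `extract_title`, and `count_pages` names are hypothetical, and the simple first-line heuristic stands in for the machine-learning components the paper uses. The sketch only shows how named steps can be run in sequence, tested in isolation, and swapped out one at a time.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container: raw page texts plus extracted metadata."""
    pages: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)


# A step takes a document and returns it with more metadata filled in.
Step = Callable[[Document], Document]


def extract_title(doc: Document) -> Document:
    # Placeholder heuristic (an assumption, not the paper's method):
    # treat the first non-empty line of the first page as the title.
    first_page = doc.pages[0] if doc.pages else ""
    for line in first_page.splitlines():
        if line.strip():
            doc.metadata["title"] = line.strip()
            break
    return doc


def count_pages(doc: Document) -> Document:
    # Trivial step that records how many pages were inspected.
    doc.metadata["page_count"] = str(len(doc.pages))
    return doc


class Workflow:
    """Runs named steps in insertion order; any step can be replaced."""

    def __init__(self, steps: Dict[str, Step]):
        self.steps = dict(steps)

    def replace(self, name: str, step: Step) -> None:
        # Swapping a step keeps its position in the workflow.
        if name not in self.steps:
            raise KeyError(name)
        self.steps[name] = step

    def run(self, doc: Document) -> Document:
        for step in self.steps.values():
            doc = step(doc)
        return doc


workflow = Workflow({"title": extract_title, "pages": count_pages})
result = workflow.run(Document(pages=["\nA Sample Title\nBody text", "Page two"]))
```

Because each step is an ordinary function, it can be unit-tested on its own, and `workflow.replace("title", other_step)` substitutes a different title extractor without touching the rest of the pipeline.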


Bibliographic details

Source: IAPR International Workshop on Document Analysis Systems, 2012, 6 pages
Author: Tkaczyk D.
Original format: PDF
CLC classification: TP391-53

Similar literature

1. Enhanced metadata modelling and extraction methods to acquire contextual pedagogical information from e-learning contents for personalised learning systems [J]. Pal Saurabh, Pramanik Pijush Kanti Dutta, Choudhury Prasenjit. Multimedia Tools and Applications. 2021, no. 16
2. Automatic extraction of metadata from scientific publications for CRIS systems [J]. Kovačević A., Ivanović D., Milosavljević B. Program: Automated Library and Information Systems. 2011, no. 4
3. Duality and modularity in elliptic integrable systems and vacua of $\mathcal{N}=1^{*}$ gauge theories [J]. Antoine Bourget, Jan Troost. The Journal of High Energy Physics. 2015, no. 4
4. A Modular Metadata Extraction System for Born-Digital Articles [C]. Tkaczyk D. Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on. 2012
5. Information Extraction and Metadata Annotation for Algorithms in Digital Libraries [D]. Tuarob, Suppawong. 2015
6. A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records [O]. Tasnia Tahsin, Davy Weissenbacher, Robert Rivera. 2016
7. A modular metadata extraction system for born-digital articles [O]. Tkaczyk, Dominika, Bolikowski, Łukasz, Czeczko, Artur. 2012
