Conference: ACM/IEEE-CS Joint Conference on Digital Libraries

Evaluation of Header Metadata Extraction Approaches and Tools for Scientific PDF Documents

Abstract

This paper evaluates the performance of tools for the extraction of metadata from scientific articles. Accurate metadata extraction is an important task for automating the management of digital libraries. This comparative study is a guide for developers looking to integrate the most suitable and effective metadata extraction tool into their software. We shed light on the strengths and weaknesses of seven tools in common use. In our evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop. SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.