
Web-based Citation Parsing, Correction and Augmentation

Abstract

Given the considerable value of citation metadata, many methods have been proposed to automate Citation Metadata Extraction (CME). Existing methods rely primarily on content analysis of the citation text. However, the results of such content-based methods are often unreliable. Moreover, the extracted citation metadata is only a small part of the relevant metadata spread across the Internet. In contrast to content-based CME methods, this paper proposes a Web-based CME approach and a citation enrichment system called BibAll. BibAll corrects the parsing results of content-based CME methods and augments citation metadata by leveraging relevant bibliographic data from digital repositories and cited-by publications on the Web. BibAll consists of four main components: citation parsing, Web-based bibliographic data retrieval, irrelevant bibliographic data filtering, and relevant bibliographic data integration. The system has been tested on the publicly available FLUX-CIM dataset. Experimental results show that BibAll significantly improves citation parsing accuracy and augments the metadata of the original citation.
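The four components named in the abstract can be pictured as a simple pipeline. The following is a minimal sketch under stated assumptions, not BibAll's actual implementation: the abstract does not describe how each stage works, so the function names (`parse_citation`, `retrieve`, `filter_relevant`, `integrate`, `enrich`), the in-memory `REPO` standing in for Web search results, the title-similarity threshold, and the majority-vote merge are all hypothetical choices made for illustration. The citation and records are toy data.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical stand-in for bibliographic records found on the Web (toy data).
REPO = [
    {"title": "A toy paper on citation parsing", "year": "2010", "venue": "Toy Conference"},
    {"title": "A toy paper on citation parsing", "year": "2010", "pages": "1-10"},
    {"title": "An unrelated paper on toy robots", "year": "2005"},
]


def parse_citation(text):
    """Stage 1: naive content-based parse, assuming 'authors. title. year.'."""
    parts = [p.strip() for p in text.split(". ") if p.strip()]
    year = re.search(r"\b(19|20)\d{2}\b", text)
    return {
        "authors": parts[0] if parts else "",
        "title": parts[1] if len(parts) > 1 else "",
        "year": year.group(0) if year else "",
    }


def retrieve(parsed, repository):
    """Stage 2: fetch candidates sharing at least one title word (assumed query)."""
    words = set(parsed["title"].lower().split())
    return [r for r in repository if words & set(r["title"].lower().split())]


def filter_relevant(parsed, candidates, threshold=0.8):
    """Stage 3: drop candidates whose title is not similar enough (assumed rule)."""
    def similarity(record):
        return SequenceMatcher(None, parsed["title"].lower(), record["title"].lower()).ratio()
    return [r for r in candidates if similarity(r) >= threshold]


def integrate(parsed, relevant):
    """Stage 4: majority vote per field; parsed values are kept as a fallback."""
    merged = dict(parsed)
    for field in {key for record in relevant for key in record}:
        values = [record[field] for record in relevant if record.get(field)]
        if values:
            merged[field] = Counter(values).most_common(1)[0][0]
    return merged


def enrich(citation_text, repository=REPO):
    """Run the four stages in order on one citation string."""
    parsed = parse_citation(citation_text)
    candidates = retrieve(parsed, repository)
    relevant = filter_relevant(parsed, candidates)
    return integrate(parsed, relevant)


# The toy citation has a typo in the title ("citaton") and no venue or pages.
result = enrich("Doe J, Roe R. A toy paper on citaton parsing. 2010.")
```

In this toy run, the two matching records outvote the misspelled parsed title (correction) and contribute `venue` and `pages` fields that the citation string never contained (augmentation), while the unrelated record is discarded by the filter.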