Published in: International conference on digital information management

Examination of effective features for CRF-based bibliography extraction from reference strings

Abstract

Metadata such as bibliographic information about documents is indispensable for the effective use of digital libraries. In particular, the reference fields of academic papers contain much bibliographic information, such as authors' names and document titles. We are therefore developing a method for automatically extracting bibliographic information from reference strings using a conditional random field (CRF). The features used by the CRF determine the accuracy of this method. We examine which features are effective for accurate extraction by experimentally varying the features used. The experiments showed that lexical features were quite effective for accurate extraction, and that properly augmenting the lexicons could lead to further improvements in accuracy.
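To make the idea of lexical features concrete, the following is a minimal sketch of how per-token features for a reference string might be built before being handed to a CRF. It is not the authors' implementation: the tokenizer, the feature names, and the two tiny lexicons (`SURNAMES`, `VENUE_WORDS`) are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicons (assumptions for illustration only);
# a real system would load much larger name and venue lists.
SURNAMES = {"smith", "tanaka", "lafferty"}
VENUE_WORDS = {"proceedings", "conference", "journal", "transactions"}


def tokenize(reference):
    """Split a reference string into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", reference)


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    lower = tok.lower()
    feats = {
        # Surface-form features
        "lower": lower,
        "is_capitalized": tok[:1].isupper(),
        "is_digit": tok.isdigit(),
        "is_year": re.fullmatch(r"(19|20)\d{2}", tok) is not None,
        "is_punct": re.fullmatch(r"[^\w\s]", tok) is not None,
        # Lexical features: membership in the (hypothetical) lexicons
        "in_surname_lexicon": lower in SURNAMES,
        "in_venue_lexicon": lower in VENUE_WORDS,
    }
    # Context features from neighbouring tokens, with boundary markers
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def reference_features(reference):
    """Return the tokens of a reference string and one feature dict per token."""
    tokens = tokenize(reference)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    toks, feats = reference_features(
        "J. Smith. Parsing references. In Proceedings of ICDIM, 2010."
    )
    for tok, f in zip(toks, feats):
        print(tok, f["in_surname_lexicon"], f["in_venue_lexicon"], f["is_year"])
```

Under these assumptions, augmenting a lexicon amounts to adding entries to sets such as `SURNAMES`, which changes the lexicon-membership features without touching the rest of the pipeline. A sequence of such feature dictionaries, paired with per-token field labels (author, title, and so on), could then be passed to a CRF toolkit that accepts dictionary-style features.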