Conference: ACM/IEEE-CS Joint Conference on Digital Libraries

Extracting and Matching Authors and Affiliations in Scholarly Documents

Abstract

We introduce Enlil, an information extraction system that discovers the institutional affiliations of authors in scholarly papers. Enlil works in two steps: a conditional random field first identifies authors and affiliations, and a support vector machine then connects authors to their affiliations. We benchmark Enlil in three separate experiments drawn from three different sources: the ACL Anthology, the ACM Digital Library, and a set of cross-disciplinary scientific journal articles acquired by querying Google Scholar. Against a state-of-the-art production baseline, Enlil reports a statistically significant improvement in F_1 of nearly 10% (p << 0.01). In the case of multidisciplinary articles from Google Scholar, Enlil is benchmarked over both clean input (F_1 > 90%) and automatically acquired input (F_1 > 80%). We have deployed Enlil in a case study involving Asian genomics research publication patterns to understand how government-sponsored collaborative links evolve. Enlil has enabled our team to construct and validate new metrics to quantify the facilitation of research as opposed to direct publication.
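
The F_1 figures quoted above are the harmonic mean of precision and recall. The abstract does not say how a predicted author-affiliation link is judged correct, so the following is only a minimal sketch that assumes exact-match scoring over sets of (author, affiliation) pairs. The function name `f1_score` and the toy data are hypothetical and are not taken from the paper.

```python
def f1_score(gold, predicted):
    """Return (precision, recall, F1) over sets of (author, affiliation) pairs.

    Assumption: a predicted pair counts as correct only if it matches a
    gold pair exactly. The paper may use a different matching criterion.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data for illustration only.
gold_pairs = {
    ("A. Author", "Univ. X"),
    ("B. Author", "Univ. X"),
    ("C. Author", "Institute Y"),
}
predicted_pairs = {
    ("A. Author", "Univ. X"),
    ("B. Author", "Institute Y"),  # wrong affiliation for B. Author
    ("C. Author", "Institute Y"),
}

precision, recall, f1 = f1_score(gold_pairs, predicted_pairs)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Under this scoring, one wrongly linked author lowers both precision and recall, because the incorrect pair is a false positive and the missed gold pair is a false negative.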
