Extracting and Matching Authors and Affiliations in Scholarly Documents

Abstract

We introduce Enlil, an information extraction system that discovers the institutional affiliations of authors in scholarly papers. Enlil consists of two steps: one that first identifies authors and affiliations using a conditional random field; and a second support vector machine that connects authors to their affiliations. We benchmark Enlil in three separate experiments drawn from three different sources: the ACL Anthology, the ACM Digital Library, and a set of cross-disciplinary scientific journal articles acquired by querying Google Scholar. Against a state-of-the-art production baseline, Enlil reports a statistically significant improvement in F_1 of nearly 10% (p << 0.01). In the case of multidisciplinary articles from Google Scholar, Enlil is benchmarked over both clean input (F_1 > 90%) and automatically-acquired input (F_1 > 80%). We have deployed Enlil in a case study involving Asian genomics research publication patterns to understand how government sponsored collaborative links evolve. Enlil has enabled our team to construct and validate new metrics to quantify the facilitation of research as opposed to direct publication.

Bibliographic record

Source
ACM/IEEE-CS joint conference on digital libraries | 2013 | pp. 219-228 | 10 pages
Conference location: Indianapolis, IN (US)
Authors
Huy Hoang Nhat Do; Muthu Kumar Chandrasekaran; Philip S. Cho; Min-Yen Kan
Author affiliations

Department of Computer Science, School of Computing, National University of Singapore; Asia Research Institute, National University of Singapore

Asia Research Institute, National University of Singapore

Department of Computer Science, School of Computing, National University of Singapore; NUS Interactive and Digital Media Institute, National University of Singapore

Original format: PDF
Keywords
Metadata Extraction; Logical Structure Discovery; Conditional Random Fields; Support Vector Machine; Rich Document Features;

Similar literature

1. Implicit Semantics Based Metadata Extraction and Matching of Scholarly Documents [J]. Jiang Congfeng, Liu Junming, Ou Dongyang. Journal of database management, 2018, Issue 2.
2. Joint learning of author and citation contexts for computing drift in scholarly documents [J]. Vijayarani J., Geetha T. V. International journal of machine learning and cybernetics, 2021, Issue 6.
3. Authoring social reality with documents: From authorship of documents and documentary boundary objects to practical authorship [J]. Huvila Isto. The Journal of Documentation, 2019, Issue 1.
4. Extracting and Matching Authors and Affiliations in Scholarly Documents [C]. Huy Hoang Nhat Do, Muthu Kumar Chandrasekaran, Philip S. Cho. ACM/IEEE-CS joint conference on digital libraries, 2013.
5. Estimation of the Number of Authors of a Multi-Author Document [D]. Leibowitz, Caleb. 2015.
6. ERRATUM: Correction for affiliation of the 8th author-corresponding author. Multicenter nonrandomized trial of ramosetron versus palonosetron in controlling chemotherapy-induced nausea and vomiting for colorectal cancer [O]. Jin Soo Kim, Ji Yeon Kim, Sang-Jeon Lee. 2014.
7. Extracting and Matching Authors and Affiliations in Scholarly Documents [O]. Huy Hoang, Nhat Do, Muthu Kumar Ch. 2013.
