Journal: Journal of the American Society for Information Science and Technology

A Fast Method Based on Multiple Clustering for Name Disambiguation in Bibliographic Citations



Abstract

Name ambiguity in bibliographic citations degrades the quality of services in digital libraries. Previous methods are not widely applied because of their high computational complexity and their strong dependency on numerous attributes, such as institutional affiliation, research area, and address, which are difficult to obtain in practice. To solve this problem, we propose a novel coarse-to-fine framework for name disambiguation that sequentially employs 3 common and easily accessible attributes (i.e., coauthor name, article title, and publication venue). Our proposed framework is based on multiple clustering and consists of 3 steps: (a) clustering articles by coauthorship to obtain rough clusters, that is, fragments; (b) clustering the fragments obtained in step (a) by title information to obtain bigger fragments; and (c) clustering the fragments obtained in step (b) by the latent relations among venues. Experimental results on a Digital Bibliography and Library Project (DBLP) data set show that our method outperforms the existing state-of-the-art methods by 2.4% to 22.7% on the average pairwise F1 score and is 10 to 100 times faster in terms of execution time.
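The three-step, coarse-to-fine structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state which similarity measures or thresholds are used, so the sketch assumes (a) a shared coauthor name links two articles, (b) a Jaccard overlap of title words above a hypothetical `title_threshold` links two fragments, and (c) a shared venue stands in for the "latent relations among venues". The record layout, the helper names, and the sample data are all invented for illustration.

```python
from itertools import combinations


def merge_groups(groups, linked):
    """Merge groups transitively (union-find) wherever linked(g, h) is true."""
    parent = list(range(len(groups)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(groups)), 2):
        if linked(groups[i], groups[j]):
            parent[find(i)] = find(j)

    merged = {}
    for i, group in enumerate(groups):
        merged.setdefault(find(i), []).extend(group)
    return list(merged.values())


def coauthors(group):
    return set().union(*(a["coauthors"] for a in group))


def title_words(group):
    return {w for a in group for w in a["title"].lower().split()}


def venues(group):
    return {a["venue"] for a in group}


def jaccard(x, y):
    return len(x & y) / len(x | y) if (x | y) else 0.0


def disambiguate(articles, title_threshold=0.3):
    """Return clusters of article ids; each cluster is one presumed author."""
    fragments = [[a] for a in articles]
    # Step (a): assumed rule, articles sharing a coauthor belong together.
    fragments = merge_groups(
        fragments, lambda g, h: bool(coauthors(g) & coauthors(h)))
    # Step (b): assumed rule, fragments with similar title vocabulary merge.
    fragments = merge_groups(
        fragments,
        lambda g, h: jaccard(title_words(g), title_words(h)) >= title_threshold)
    # Step (c): assumed stand-in for latent venue relations, a shared venue.
    fragments = merge_groups(
        fragments, lambda g, h: bool(venues(g) & venues(h)))
    return [sorted(a["id"] for a in g) for g in fragments]


# Invented sample records for one ambiguous author name.
SAMPLE = [
    {"id": 1, "coauthors": {"Li"}, "title": "graph clustering methods", "venue": "KDD"},
    {"id": 2, "coauthors": {"Li", "Wu"}, "title": "stream mining", "venue": "ICDM"},
    {"id": 3, "coauthors": {"Zhao"}, "title": "graph clustering algorithms", "venue": "WWW"},
    {"id": 4, "coauthors": {"Chen"}, "title": "protein folding", "venue": "WWW"},
    {"id": 5, "coauthors": {"Kim"}, "title": "soil chemistry", "venue": "Geoderma"},
]

if __name__ == "__main__":
    print(disambiguate(SAMPLE))
```

Each stage only ever merges fragments produced by the previous one, so the number of pairwise comparisons shrinks as the pipeline proceeds. That shrinking workload is one plausible reading of why a coarse-to-fine design can be fast, though the abstract itself does not give the reason.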

Bibliographic details

  • Author affiliations

    School of Software, Dalian University of Technology, Economy and Technology Development Area, Dalian, 116620, China;

    School of Software, Dalian University of Technology, Economy and Technology Development Area, Dalian, 116620, China;

    School of Software, Dalian University of Technology, Economy and Technology Development Area, Dalian, 116620, China;

    School of Software, Dalian University of Technology, Economy and Technology Development Area, Dalian, 116620, China;

  • Original format: PDF
  • Language: English
