Conference: Australasian Joint Conference on Artificial Intelligence

Author Name Disambiguation for Ranking and Clustering PubMed Data Using NetClus

Abstract

The ranking and clustering of publication databases are often used to discover useful information about research areas. NetClus is an iterative algorithm for clustering heterogeneous star-schema information networks that incorporates the ranking information of individual data types. The algorithm has previously been evaluated using the DBLP database. In this paper, we apply NetClus to PubMed, a free database of articles on life sciences and biomedical topics, to discover key aspects of cancer research. The absence of unique identifiers for authors in PubMed introduces additional challenges. To address this, we introduce an improved author disambiguation technique that combines affiliation string normalisation, based on the vector space model, with co-author networks. Our technique for disambiguating authors, which offers higher accuracy than existing techniques, significantly improves NetClus clustering results.
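The abstract does not give the details of the disambiguation technique, so the following is only a minimal sketch of the general idea it names: comparing affiliation strings in a vector space model and combining that with co-author overlap. The TF-IDF weighting, the record fields (`name`, `affiliation`, `coauthors`), the sample records, the function names, and the `threshold` value are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokens(affiliation):
    """Lowercase an affiliation string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", affiliation.lower())


def tfidf_vectors(affiliations):
    """Build one sparse TF-IDF vector (a dict) per affiliation string."""
    docs = [Counter(tokens(a)) for a in affiliations]
    n = len(docs)
    # Document frequency: in how many strings each term occurs.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF, so terms present in every string keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def same_author(rec_a, rec_b, vec_a, vec_b, threshold=0.4):
    """Hypothetical decision rule: a shared co-author OR similar affiliations.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    shared = set(rec_a["coauthors"]) & set(rec_b["coauthors"])
    return bool(shared) or cosine(vec_a, vec_b) >= threshold


# Invented sample records that share the same author name string.
records = [
    {"name": "J Smith",
     "affiliation": "Dept. of Oncology, University of Example, Sydney",
     "coauthors": ["A Lee", "B Chen"]},
    {"name": "J Smith",
     "affiliation": "Department of Oncology, Univ. of Example, Sydney, Australia",
     "coauthors": ["C Patel"]},
    {"name": "J Smith",
     "affiliation": "Institute of Marine Biology, Sample College, Oslo",
     "coauthors": ["D Berg"]},
]

vectors = tfidf_vectors([r["affiliation"] for r in records])

if __name__ == "__main__":
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            sim = cosine(vectors[i], vectors[j])
            match = same_author(records[i], records[j], vectors[i], vectors[j])
            print(i, j, round(sim, 3), match)
```

In this sketch the first two records are treated as the same author because their affiliation vectors are similar despite differing abbreviations, while the third is kept apart. A fuller pipeline would presumably group the pairwise decisions into author clusters before building the network for NetClus, but that step is not described in the abstract.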
