Journal: IEICE Transactions on Information and Systems

Topic Representation of Researchers' Interests in a Large-Scale Academic Database and Its Application to Author Disambiguation


Abstract

It is crucial to promote interdisciplinary research and recommend collaborators from different research fields via academic database analysis. This paper addresses the problem of characterizing researchers' interests with a set of diverse research topics found in a large-scale academic database. Specifically, we first use latent Dirichlet allocation (LDA) to extract topics as distributions over words from a training dataset. Then, we convert the textual features of a researcher's publications into topic vectors and calculate the centroid of these vectors to summarize the researcher's interests as a single vector. In experiments conducted on CiNii Articles, the largest academic database in Japan, we show that the extracted topics reflect the diversity of the research fields in the database. The experimental results also indicate that the proposed topic representation is applicable to the author disambiguation problem.
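
The centroid step described in the abstract can be illustrated with a minimal sketch. It assumes each publication has already been converted to a topic vector by some LDA model; the vectors below are made-up values over three hypothetical topics, not data from the paper. The use of cosine similarity to compare two researchers' interest vectors is also an assumption made for illustration, since the abstract does not state which measure is used for author disambiguation.

```python
import math


def centroid(topic_vectors):
    """Average per-publication topic vectors into one interest vector."""
    if not topic_vectors:
        raise ValueError("need at least one topic vector")
    n_topics = len(topic_vectors[0])
    return [
        sum(v[k] for v in topic_vectors) / len(topic_vectors)
        for k in range(n_topics)
    ]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors (assumed measure, not from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical topic vectors (3 topics) for two author records.
author_a_papers = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]
author_b_papers = [[0.1, 0.1, 0.8], [0.2, 0.2, 0.6]]

interest_a = centroid(author_a_papers)
interest_b = centroid(author_b_papers)

# A low similarity hints that the two records belong to different researchers.
similarity = cosine_similarity(interest_a, interest_b)
print(interest_a, interest_b, similarity)
```

In this sketch, two author records with the same name but dissimilar interest vectors would be treated as candidates for different people; any decision threshold would have to be chosen empirically.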
