Hierarchical Expert Profiling Using Heterogeneous Information Networks

Abstract

Linking an expert to his knowledge areas is still a challenging research problem. The task is usually divided into two steps: identifying the knowledge areas/topics in the text corpus and assigning them to the experts. Common approaches to the expert profiling task are based on the Latent Dirichlet Allocation (LDA) algorithm. As a result, they require pre-defining the number of topics to be identified, which is not ideal in most cases. Furthermore, LDA generates a list of independent topics without any kind of relationship between them. Expert profiles created from this kind of flat topic list have been reported as highly redundant and often either too specific or too general. In this paper we propose a methodology that addresses these limitations by creating hierarchical expert profiles, where the knowledge areas of a researcher are mapped along different granularity levels, from broad areas to more specific ones. For this purpose, we explore the rich structure and semantics of Heterogeneous Information Networks (HINs). Our strategy is divided into two parts. First, we introduce a novel algorithm that can fully use the rich content of an HIN to create a topical hierarchy by discovering overlapping communities and ranking the nodes inside each community. We then present a strategy to map the knowledge areas of an expert along all the levels of the hierarchy, exploiting the information we have about the expert to obtain a hierarchical profile of topics. To test our proposed methodology, we used a computer science bibliographical dataset to create a star-schema HIN containing publications as star-nodes and authors, keywords and ISI fields as attribute-nodes. We use heterogeneous pointwise mutual information to demonstrate the quality and coherence of the hierarchies we create. Furthermore, we use manually labelled data as ground truth to evaluate our hierarchical expert profiles, showing that our strategy is capable of building accurate profiles.
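
The ideas in the abstract can be made concrete with a small sketch. It is not the authors' algorithm: it assumes a toy star-schema HIN with invented records, uses plain pointwise mutual information over co-occurrence on the same publication as a stand-in for the heterogeneous variant the paper mentions, and scores an author against a hand-written (hypothetical) two-level topic hierarchy instead of one discovered from overlapping communities. The names `PUBLICATIONS`, `HIERARCHY`, `attribute_nodes`, `pmi` and `profile` are all illustrative.

```python
import math

# Toy star-schema HIN: each publication is a star-node linked to typed
# attribute-nodes. All records are invented for illustration.
PUBLICATIONS = [
    {"authors": ["ana"], "keywords": ["topic model", "lda"]},
    {"authors": ["ana", "rui"], "keywords": ["topic model", "community detection"]},
    {"authors": ["rui"], "keywords": ["community detection", "graph"]},
    {"authors": ["rui"], "keywords": ["graph", "community detection"]},
]

# Hypothetical hand-written hierarchy: a broad topic and one narrower child.
HIERARCHY = {
    "networks": {"parent": None, "keywords": {"community detection", "graph"}},
    "communities": {"parent": "networks", "keywords": {"community detection"}},
    "text mining": {"parent": None, "keywords": {"topic model", "lda"}},
}


def attribute_nodes(pub):
    """Return the (type, value) attribute-nodes attached to one star-node."""
    return {(node_type, value) for node_type, values in pub.items() for value in values}


def pmi(node_a, node_b, pubs):
    """Plain PMI of two attribute-nodes co-occurring on the same publication.

    Assumption: a simplification, not the paper's heterogeneous PMI.
    """
    n = len(pubs)
    node_sets = [attribute_nodes(p) for p in pubs]
    count_a = sum(node_a in s for s in node_sets)
    count_b = sum(node_b in s for s in node_sets)
    count_ab = sum(node_a in s and node_b in s for s in node_sets)
    if count_ab == 0:
        return float("-inf")  # the two nodes never co-occur
    return math.log((count_ab / n) / ((count_a / n) * (count_b / n)))


def profile(author, pubs, hierarchy):
    """Score every topic, at every level, by the share of the author's
    publications that carry at least one of the topic's keywords."""
    own = [p for p in pubs if author in p["authors"]]
    scores = {}
    for topic, info in hierarchy.items():
        hits = sum(bool(info["keywords"] & set(p["keywords"])) for p in own)
        scores[topic] = hits / len(own) if own else 0.0
    return scores
```

Calling `profile("rui", PUBLICATIONS, HIERARCHY)` yields a score for both the broad topic and its narrower child, which is the sense in which a profile spans several granularity levels; a real pipeline would derive the hierarchy and the keyword rankings from the network itself.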

Bibliographic details

  • Source
    Discovery science | 2018 | pp. 344-360 | 17 pages
  • Conference location: Limassol (CY)
  • Author affiliations

    CRACS INESC TEC, Porto, Portugal, Departamento de Ciencia de Computadores - Faculdade de Ciencias, Universidade do Porto, Porto, Portugal;

    CRACS INESC TEC, Porto, Portugal, Departamento de Ciencia de Computadores - Faculdade de Ciencias, Universidade do Porto, Porto, Portugal;

    CRACS INESC TEC, Porto, Portugal, Departamento de Ciencia de Computadores - Faculdade de Ciencias, Universidade do Porto, Porto, Portugal;

  • Original format: PDF
  • Language: eng
  • Keywords

    Expert profiling; Topic modelling; Information networks;
