Conference: International Conference on Advanced Data Mining and Applications

Doctoral Advisor or Medical Condition: Towards Entity-Specific Rankings of Knowledge Base Properties

Abstract

In knowledge bases such as Wikidata, it is possible to assert a large set of properties for entities, ranging from generic ones such as name and place of birth to highly profession-specific or background-specific ones such as doctoral advisor or medical condition. Determining a preference or ranking within this large set is a challenge in tasks such as prioritisation of edits or natural-language generation. Most previous approaches to ranking knowledge base properties are purely data-driven and, as we show, mistake frequency for interestingness. In this work, we have developed a human-annotated dataset of 350 preference judgments among pairs of knowledge base properties for fixed entities. From this set, we isolate a subset of pairs for which humans show a high level of agreement (87.5% on average). We show, however, that baseline and state-of-the-art techniques achieve only 61.3% precision in predicting human preferences for this subset. We then develop a technique based on a combination of general frequency, applicability to similar entities and semantic similarity that achieves 74% precision. The preference dataset is available at https://www.kaggle.com/srazniewski/wikidatapropertyranking.
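The abstract does not say how the three signals are combined or how precision is computed, so the following is only a minimal sketch under stated assumptions: a weighted sum of the three signals, and precision taken as the share of pairs where the higher-scoring property matches the human preference. All names (`PropertyFeatures`, `score`, `precision`), the weights, the feature values and the three toy judgments are invented for illustration and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyFeatures:
    """Hypothetical per-property signals for one fixed entity, each in [0, 1]."""
    general_frequency: float     # how often the property occurs across all entities
    similar_entity_share: float  # how often it occurs among entities similar to this one
    semantic_similarity: float   # closeness of the property to the entity's description


def score(f, weights):
    """Weighted sum of the three signals (the combination rule is an assumption)."""
    w_freq, w_similar, w_semantic = weights
    return (w_freq * f.general_frequency
            + w_similar * f.similar_entity_share
            + w_semantic * f.semantic_similarity)


def precision(judgments, features, weights):
    """Share of (a, b, preferred) pairs where the higher-scoring property is preferred."""
    correct = 0
    for a, b, preferred in judgments:
        predicted = a if score(features[a], weights) >= score(features[b], weights) else b
        correct += predicted == preferred
    return correct / len(judgments)


# Made-up feature values for an imagined scientist entity.
features = {
    "place of birth": PropertyFeatures(0.90, 0.90, 0.20),
    "doctoral advisor": PropertyFeatures(0.05, 0.80, 0.90),
    "medical condition": PropertyFeatures(0.01, 0.02, 0.10),
}

# Made-up human judgments: (property A, property B, preferred property).
judgments = [
    ("doctoral advisor", "medical condition", "doctoral advisor"),
    ("place of birth", "medical condition", "place of birth"),
    ("place of birth", "doctoral advisor", "doctoral advisor"),
]

frequency_only = precision(judgments, features, (1.0, 0.0, 0.0))
combined = precision(judgments, features, (0.1, 0.3, 0.6))
print(f"frequency only: {frequency_only:.2f}, combined: {combined:.2f}")
```

On this toy data the frequency-only weighting gets the third pair wrong because place of birth is far more common than doctoral advisor, which mirrors the "frequency mistaken for interestingness" point made above. The numbers say nothing about the results reported in the abstract.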