
Uncovering hidden semantics of set information in knowledge bases



Abstract

Knowledge Bases (KBs) contain a wealth of structured information about entities and predicates. This paper focuses on set-valued predicates, i.e., the relationship between an entity and a set of entities. In KBs, this information is often represented in two formats: (i) via counting predicates such as numberOfChildren and staffSize, which store aggregated integers, and (ii) via enumerating predicates such as parentOf and worksFor, which store individual set memberships. The two formats are typically complementary: unlike enumerating predicates, counting predicates do not give away individuals, but they are more likely to be informative about the true set size. This coexistence could enable interesting applications in question answering and KB curation.

In this paper we aim at uncovering this hidden knowledge. We proceed in two steps. (i) We identify set-valued predicates among the predicates of a given KB via statistical and embedding-based features. (ii) We link counting predicates and enumerating predicates by a combination of co-occurrence, correlation and textual relatedness metrics. We analyse the prevalence of count information in four prominent knowledge bases, and show that our linking method achieves up to 0.55 F1 score in set predicate identification versus 0.40 F1 score of a random selection, and normalized discounted gains of up to 0.84 at position 1 and 0.75 at position 3 in relevant predicate alignments. Our predicate alignments are showcased in a demonstration system available at https://counqer.mpi-inf.mpg.de/spo. (C) 2020 Elsevier B.V. All rights reserved.
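To make the linking step more concrete, the following is a minimal sketch of two of the signals the abstract names (co-occurrence and correlation) for one counting/enumerating predicate pair. It is not the authors' implementation: the toy triples, the function names, the use of Jaccard overlap of subjects as the co-occurrence measure, and the use of Pearson correlation are all assumptions made for illustration. Textual relatedness is left out.

```python
from collections import defaultdict
from math import sqrt

# Toy (subject, predicate, object) triples; every value here is made up.
TRIPLES = [
    ("alice", "parentOf", "bob"),
    ("alice", "parentOf", "carol"),
    ("alice", "numberOfChildren", 2),
    ("dave", "parentOf", "erin"),
    ("dave", "numberOfChildren", 3),   # KB enumerates fewer children than it counts
    ("frank", "parentOf", "gina"),
    ("frank", "parentOf", "hank"),
    ("frank", "parentOf", "ivy"),
    ("frank", "parentOf", "june"),
    ("frank", "numberOfChildren", 4),
    ("kate", "numberOfChildren", 1),   # count only, no enumerated members
]


def enumerated_sizes(triples, predicate):
    """Number of objects listed per subject for an enumerating predicate."""
    sizes = defaultdict(int)
    for subject, pred, _ in triples:
        if pred == predicate:
            sizes[subject] += 1
    return dict(sizes)


def stored_counts(triples, predicate):
    """Integer stored per subject for a counting predicate."""
    return {s: o for s, p, o in triples if p == predicate}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either side has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def alignment_signals(triples, counting, enumerating):
    """Co-occurrence (Jaccard over subjects) and correlation for one pair."""
    counts = stored_counts(triples, counting)
    sizes = enumerated_sizes(triples, enumerating)
    shared = sorted(set(counts) & set(sizes))
    union = set(counts) | set(sizes)
    cooccurrence = len(shared) / len(union) if union else 0.0
    correlation = 0.0
    if len(shared) >= 2:
        correlation = pearson([counts[s] for s in shared],
                              [sizes[s] for s in shared])
    return {
        "shared_subjects": len(shared),
        "cooccurrence": cooccurrence,
        "correlation": correlation,
    }


signals = alignment_signals(TRIPLES, "numberOfChildren", "parentOf")
print(signals)
```

On the toy data, three of the four subjects carry both predicates, and the stored counts track the enumerated set sizes only loosely because one subject is under-enumerated. A ranking of candidate alignments could combine such signals, but how they are weighted is not stated in the abstract.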

Bibliographic record

  • Source
    Journal of Web Semantics | 2020, Issue 10 | 100588.1-100588.13 | 13 pages
  • Author affiliations

    Max Planck Inst Informat Saarland Informat Campus D-66123 Saarbrucken Germany;

    Max Planck Inst Informat Saarland Informat Campus D-66123 Saarbrucken Germany;

    Max Planck Inst Informat Saarland Informat Campus D-66123 Saarbrucken Germany;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

