Journal of Web Semantics

A bootstrapping approach to entity linkage on the Semantic Web


Abstract

In the Big Data era, RDF data have grown to billions of entities, which makes entity linkage on the Semantic Web increasingly challenging. Although millions of entities, typically denoted by URIs, have been explicitly linked with owl:sameAs, many potentially coreferent ones remain unlinked. Existing automatic approaches address this problem mainly from two perspectives: one is equivalence reasoning, which infers semantically coreferent entities but is likely to miss many potential candidates; the other is similarity computation between the property-values of entities, which is not always accurate and does not scale well. In this paper, we introduce a bootstrapping approach that leverages both kinds of methods for entity linkage. Given an entity, our approach first infers a set of semantically coreferent entities. It then iteratively expands this entity set using discriminative property-value pairs. The discriminability is learned with a statistical measure that not only identifies important property-values in the entity set but also takes matched properties into account. Frequent property combinations are also mined to improve linkage accuracy. We develop an online entity linkage search engine and show its superior precision and recall by comparing it with representative approaches on one large-scale dataset and two benchmark datasets. © 2015 Elsevier B.V. All rights reserved.
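
The two-stage idea in the abstract (an owl:sameAs closure followed by iterative expansion over discriminative property-value pairs) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the statistical measure, so `discriminability` below is a hypothetical stand-in (the share of entities carrying a pair that are already linked), and the handling of matched properties and the mining of frequent property combinations are omitted. All function names, the threshold, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def sameas_closure(seed, sameas_links):
    """Entities reachable from `seed` via symmetric, transitive owl:sameAs links."""
    adj = defaultdict(set)
    for a, b in sameas_links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {seed}, [seed]
    while stack:
        for neighbour in adj[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def discriminability(pair, linked, descriptions):
    """Hypothetical stand-in score (NOT the paper's measure): the fraction of
    entities carrying `pair` that are already in the linked set."""
    carriers = {e for e, pairs in descriptions.items() if pair in pairs}
    return len(carriers & linked) / len(carriers) if carriers else 0.0


def bootstrap_link(seed, sameas_links, descriptions, threshold=0.5, max_rounds=10):
    """Start from the sameAs closure, then repeatedly add entities that share
    a sufficiently discriminative property-value pair with the linked set."""
    linked = sameas_closure(seed, sameas_links)
    for _ in range(max_rounds):
        pairs = set().union(*(descriptions.get(e, set()) for e in linked))
        strong = {p for p in pairs
                  if discriminability(p, linked, descriptions) >= threshold}
        new = {e for e, pvs in descriptions.items()
               if e not in linked and pvs & strong}
        if not new:  # fixed point reached: nothing left to add
            break
        linked |= new
    return linked


# Made-up sample data: three descriptions of Berlin and three other cities.
descriptions = {
    "a:Berlin": {("label", "Berlin"), ("country", "Germany")},
    "b:Berlin": {("label", "Berlin"), ("population", "3500000")},
    "c:Berlin": {("population", "3500000"), ("country", "Germany")},
    "d:Munich": {("label", "Munich"), ("country", "Germany")},
    "e:Hamburg": {("label", "Hamburg"), ("country", "Germany")},
    "f:Cologne": {("label", "Cologne"), ("country", "Germany")},
}
sameas_links = [("a:Berlin", "b:Berlin")]

result = bootstrap_link("a:Berlin", sameas_links, descriptions)
print(sorted(result))
```

In this toy data the pair `("country", "Germany")` is shared by many unrelated entities, so it never reaches the threshold, while the population pair pulls in `c:Berlin`, which no owl:sameAs link reaches. A real measure would need to be far more robust than this ratio, which is shown only to make the control flow concrete.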