首页> 外文期刊>ACM transactions on the web >Canonical Forms for Isomorphic and Equivalent RDF Graphs: Algorithms for Leaning and Labelling Blank Nodes
【24h】

Canonical Forms for Isomorphic and Equivalent RDF Graphs: Algorithms for Leaning and Labelling Blank Nodes

机译:同构和等效RDF图的规范形式:倾斜和标记空白节点的算法

获取原文

摘要

Existential blank nodes greatly complicate a number of fundamental operations on Resource Description Framework (RDF) graphs. In particular, the problems of determining if two RDF graphs have the same structure modulo blank node labels (i.e., if they are isomorphic), or determining if two RDF graphs have the same meaning under simple semantics (i.e., if they are simple-equivalent), have no known polynomial-time algorithms. In this article, we propose methods that can produce two canonical forms of an RDF graph. The first canonical form preserves isomorphism such that any two isomorphic RDF graphs will produce the same canonical form; this iso-canonical form is produced by modifying the well-known canonical labelling algorithm NAUTY for application to RDF graphs. The second canonical form additionally preserves simple-equivalence such that any two simple-equivalent RDF graphs will produce the same canonical form; this equi-canonical form is produced by, in a preliminary step, leaning the RDF graph, and then computing the iso-canonical form. These algorithms have a number of practical applications, such as for identifying isomorphic or equivalent RDF graphs in a large collection without requiring pairwise comparison, for computing checksums or signing RDF graphs, for applying consistent Skolemisation schemes where blank nodes are mapped in a canonical manner to Internationalised Resource Identifiers (IRIs), and so forth. Likewise a variety of algorithms can be simplified by presupposing RDF graphs in one of these canonical forms. Both algorithms require exponential steps in the worst case; in our evaluation we demonstrate that there indeed exist difficult synthetic cases, but we also provide results over 9.9 million RDF graphs that suggest such cases occur infrequently in the real world, and that both canonical forms can be efficiently computed in all but a handful of such cases.
机译:现有的空白节点极大地复杂了资源描述框架(RDF)图上的许多基本操作。特别地,确定两个RDF图是否具有相同的结构(以空白节点标签为模)的问题(即,如果它们是同构的),或者确定两个RDF图在简单语义下是否具有相同的含义(即,它们是否是简单等效的)的问题),没有已知的多项式时间算法。在本文中,我们提出了可以产生RDF图的两种规范形式的方法。第一种规范形式保留了同构性,因此任何两个同构RDF图将产生相同的规范形式。通过修改众所周知的规范标注算法NAUTY并将其应用于RDF图,可以生成这种等规范形式。第二种规范形式还保留了简单等价性,以便任何两个简单等价的RDF图都将产生相同的规范形式。通过在初始步骤中倾斜RDF图,然后计算等经典形式,可以生成这种等经典形式。这些算法具有许多实际应用,例如用于识别大型集合中的同构或等效RDF图而无需成对比较,计算校验和或对RDF图进行签名,应用一致的Skolemisation方案,其中空白节点以规范的方式映射到国际化资源标识符(IRI)等。同样,可以通过以这些规范形式中的一种预设RDF图来简化各种算法。在最坏的情况下,两种算法都需要指数步长。在我们的评估中,我们证明了确实存在困难的综合案例,但我们还提供了超过990万个RDF图的结果,这些图表明此类案例在现实世界中很少出现,并且除了少数几个此类案例外,所有规范形式都可以得到有效计算案件。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号