Journal: ACM Transactions on the Web

Fast and Practical Snippet Generation for RDF Datasets


Abstract

Triple-structured open data creates value in many ways, yet reusing datasets remains challenging. Users find it difficult to assess the usefulness of a large dataset containing thousands or millions of triples. To address this need, existing abstractive methods produce a concise high-level abstraction of the data. Complementary to that, we adopt an extractive strategy: we aim to select an optimal small subset of a dataset as a snippet that compactly illustrates the dataset's content. We formulated this as a combinatorial optimization problem in our previous work. In this article, we design a new algorithm for the problem that is an order of magnitude faster than the previous one while retaining the same approximation ratio. We also develop an anytime algorithm that can generate empirically better solutions when given additional time. To suit datasets that are partially accessible via online query services (e.g., SPARQL endpoints for RDF data), we adapt our algorithms to trade off snippet quality for feasibility and efficiency in the Web environment. We carry out extensive experiments on real RDF datasets and SPARQL endpoints to evaluate quality and running time. The results demonstrate the effectiveness and practicality of the proposed algorithms.
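
To make the extractive idea concrete, the following is a minimal sketch of picking a size-bounded snippet from a set of triples. It is not the article's algorithm: the abstract does not state the objective function, so the sketch assumes, purely for illustration, that a snippet is better when it covers more distinct properties and classes, and it uses the standard greedy heuristic for coverage-style objectives. The names `elements`, `greedy_snippet`, and the `ex:` data are hypothetical.

```python
from typing import List, Set, Tuple

# A triple is (subject, predicate, object), written with prefixed names.
Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"


def elements(triple: Triple) -> Set[str]:
    """Schema elements one triple illustrates (an assumed quality measure):
    its property, plus the class when the triple is an rdf:type statement."""
    _, predicate, obj = triple
    covered = {f"prop:{predicate}"}
    if predicate == RDF_TYPE:
        covered.add(f"class:{obj}")
    return covered


def greedy_snippet(triples: List[Triple], k: int) -> List[Triple]:
    """Pick at most k triples, each time taking the one that adds the most
    schema elements not yet covered. Stops early when nothing new is gained."""
    remaining = list(triples)
    chosen: List[Triple] = []
    covered: Set[str] = set()
    for _ in range(min(k, len(remaining))):
        # max() returns the first triple among those with the largest gain.
        best = max(remaining, key=lambda t: len(elements(t) - covered))
        gain = elements(best) - covered
        if not gain:
            break
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen


# Hypothetical toy dataset.
DATA: List[Triple] = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
]

snippet = greedy_snippet(DATA, 3)
for s, p, o in snippet:
    print(s, p, o)
```

With a budget of 3, the sketch keeps one typing triple, one `ex:knows` triple, and one `ex:name` triple, and skips the triples that repeat an already covered property. For coverage objectives of this kind, the greedy rule is known to achieve a (1 − 1/e) approximation; the approximation ratio of the article's own algorithms is not stated in the abstract.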
