Published in: ESWC 2014; Extended Semantic Web Conference

A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles

Abstract

The increasing adoption of Linked Data principles has led to an abundance of datasets on the Web. However, take-up and reuse are hindered by the lack of descriptive information about the nature of the data, such as their topic coverage, dynamics or evolution. To address this issue, we propose an approach for creating linked dataset profiles. A profile consists of structured dataset metadata describing topics and their relevance. Profiles are generated by configuring techniques for sampling resources from datasets, extracting topics from reference datasets, and ranking those topics using graphical models. To enable a good trade-off between scalability and accuracy of the generated profiles, appropriate parameters are determined experimentally. Our evaluation considers topic profiles for all accessible datasets from the Linked Open Data cloud. The results show that our approach generates accurate profiles even with comparatively small sample sizes (10%) and outperforms established topic modelling approaches.
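The sampling and ranking steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how resources are sampled or which graphical model is used, so uniform random sampling and a PageRank-style power iteration over a resource-topic graph stand in here, and the names `sample_resources` and `rank_topics` are hypothetical rather than taken from the paper.

```python
import random
from collections import defaultdict


def sample_resources(resources, fraction=0.10, seed=0):
    """Uniformly sample a fraction of a dataset's resources (at least one).

    Uniform sampling is an assumption; the abstract does not name a strategy.
    """
    k = max(1, round(len(resources) * fraction))
    return random.Random(seed).sample(resources, k)


def rank_topics(edges, damping=0.85, iterations=50):
    """Score topics with PageRank-style power iteration.

    `edges` is a list of (resource, topic) pairs, treated as an undirected
    bipartite graph. PageRank is a stand-in for the unspecified
    "graphical models" mentioned in the abstract.
    """
    neighbours = defaultdict(set)
    for resource, topic in edges:
        r, t = ("resource", resource), ("topic", topic)  # avoid name clashes
        neighbours[r].add(t)
        neighbours[t].add(r)

    nodes = list(neighbours)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour shares its score equally among its own neighbours.
            incoming = sum(score[m] / len(neighbours[m]) for m in neighbours[n])
            updated[n] = (1 - damping) / len(nodes) + damping * incoming
        score = updated

    topics = [(n[1], s) for n, s in score.items() if n[0] == "topic"]
    return sorted(topics, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    resources = [f"r{i}" for i in range(100)]
    print(sample_resources(resources))  # 10 of the 100 resource identifiers
    toy_edges = [("r1", "Music"), ("r2", "Music"), ("r3", "Music"), ("r3", "Film")]
    for topic, relevance in rank_topics(toy_edges):
        print(topic, round(relevance, 3))
```

In this toy graph a topic linked to more sampled resources receives a higher score, which is one simple way to read "topics and their relevance"; the paper's actual pipeline may differ substantially.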
