Conference: European semantic web conference on semantic web

From Queriability to Informativity, Assessing 'Quality in Use' of DBpedia and YAGO

Abstract

In recent years, an increasing number of semantic data sources have been published on the web. These sources are further interlinked to form the Linking Open Data (LOD) cloud. To make full use of these data sets, it is necessary to understand their data quality. Researchers have proposed several metrics and developed numerous tools to measure the quality of LOD data sets along different dimensions. However, few studies evaluate data set quality from the perspective of usability, even though usability strongly affects the spread and reuse of LOD data sets. Usability, by contrast, is well studied in the area of software quality. In the newly published standard ISO/IEC 25010, usability is further broadened to include the notion of "quality in use" besides the other two factors, namely internal and external quality. In this paper, we first adapt the notions and methods used in software quality to assess data set quality. Second, we formally define two quality dimensions, Queriability and Informativity, from the perspective of quality in use. The two dimensions correspond to querying and answering, respectively, which are the most frequent usage scenarios for accessing LOD data sets. Third, we provide a series of metrics to measure the two dimensions. Finally, we apply the metrics to two representative LOD data sets, YAGO and DBpedia. In the evaluation, we select dozens of questions from both QALD and WebQuestions and ask a group of users to construct queries and check the answers with the help of our usability testing tool. The findings from the assessment not only illustrate the capability of our method and metrics but also give new insights into the data quality of the two knowledge bases.