LiQuate-Estimating the Quality of Links in the Linking Open Data Cloud

Abstract

In recent years, RDF datasets from almost every knowledge domain have been published in the Linking Open Data (LOD) cloud. The Linked Open Data guidelines establish the conditions that resources must satisfy in order to be included in the LOD cloud and connected to previously published data. The process of publishing and linking resources in the LOD cloud relies on: i) data cleaning and transformation into existing RDF formats, ii) storage of the data in RDF storage systems, and iii) data interlinking. Because of data source heterogeneity, the generated RDF data may be ambiguous, and links may be incomplete with respect to this data. Users of the Web of Data require linked data to meet high quality standards in order to develop applications that produce trustworthy results, but data in the LOD cloud has not been curated; thus, tools are needed to detect data quality problems. For example, researchers who study Life Sciences datasets to explain phenomena or identify anomalies demand that their findings correspond to current discoveries, and not to the effect of low data quality standards of completeness or redundancy. In this paper we propose LiQuate, a system that uses Bayesian networks to study the incompleteness of links, and ambiguities between labels and between links, in the LOD cloud, and that can be applied to any domain. Additionally, a probabilistic rule-based system is used to infer new links that associate equivalent resources and make it possible to resolve the ambiguities and incompleteness identified during the exploration of the Bayesian network. As a proof of concept, we applied LiQuate to existing Life Sciences linked datasets and detected ambiguities in the data that may compromise confidence in the results of applications such as link prediction or pattern discovery. We illustrate a variety of identified problems and propose a set of enriched intra- and inter-links that may improve the quality of data items and links of specific datasets of the LOD cloud.

Bibliographic details

Source
International Workshop on Resource Discovery | 2013 | 27 pages
Authors
Edna Ruckhaus; Maria-Esther Vidal
Original format: PDF
Chinese Library Classification: 025.042/5
Keywords
Quality; Linking; Cloud
