Conference: International Conference on Very Large Data Bases

CYADB: A Database that Covers Your Ask

Abstract

Data completeness is becoming a significant roadblock in data quality. Existing research in this area handles the certainty of a query by ignoring the incomplete part and approximating missing attributes on partially complete tuples, but leaves open the question of how the missing data affect the quality of the results. This is particularly challenging when entire tuples are absent, which can affect query certainty in ways that are not immediately obvious. To address this, we propose cyadb, a database that "covers your ask" by assessing the quality of a query answer when data are missing. cyadb is a human-in-the-loop system in which the data owner uses domain knowledge to specify aspects of the missing data, such as where data might be missing ("where"), how many data points are missing ("how many"), and how large the missing data points could be in comparison to the provided data ("how big"). Using this information, cyadb calculates the query's missing sensitivity: the maximal size of the effect that the missing data could have on the given query. Additionally, cyadb provides concrete examples of missing data that match the missing sensitivity, to help the user interactively refine the provided domain knowledge.
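
To make the idea of missing sensitivity concrete, the following is a minimal sketch for a single SUM query. It is not cyadb's algorithm or interface, which the abstract does not describe. The `MissingSpec` class, the `region` and `amount` fields, and the reading of "how big" as a multiple of the largest observed absolute value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MissingSpec:
    """Hypothetical encoding of the data owner's domain knowledge."""
    where: str      # region in which tuples may be missing ("where")
    how_many: int   # upper bound on the number of missing tuples ("how many")
    how_big: float  # bound on a missing value, as a multiple of the
                    # largest observed |amount| ("how big"); assumed reading


def sum_missing_sensitivity(rows, query_region, spec):
    """Largest change missing tuples could cause in
    SELECT SUM(amount) WHERE region = query_region."""
    # Missing tuples outside the queried region cannot affect this query.
    if spec.where != query_region or not rows:
        return 0.0
    largest = max(abs(r["amount"]) for r in rows)
    return spec.how_many * spec.how_big * largest


def worst_case_tuples(rows, query_region, spec):
    """One concrete set of missing tuples that attains the sensitivity."""
    if spec.where != query_region or not rows:
        return []
    largest = max(abs(r["amount"]) for r in rows)
    return [{"region": spec.where, "amount": spec.how_big * largest}
            for _ in range(spec.how_many)]


# Illustrative data, invented for this sketch.
rows = [
    {"region": "north", "amount": 10.0},
    {"region": "north", "amount": 40.0},
    {"region": "south", "amount": 25.0},
]
spec = MissingSpec(where="north", how_many=3, how_big=2.0)

print(sum_missing_sensitivity(rows, "north", spec))
print(worst_case_tuples(rows, "north", spec))
```

In this sketch the worst-case tuples play the role of the concrete examples mentioned in the abstract: if the data owner finds them implausible, that is a signal to tighten `how_many` or `how_big` and recompute.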