Drawing Density Core-Sets from Incomplete Relational Data

Abstract

Incompleteness is a ubiquitous issue in data, and it makes answering queries with guaranteed completeness challenging. A density core-set is a subset of an incomplete dataset whose completeness approximates that of the entire dataset. Density core-sets are an effective mechanism for estimating the completeness of queries on incomplete datasets. This paper studies the problems of drawing density core-sets from incomplete relational data. To the best of our knowledge, no such proposal has been made before. (1) We study the problems of drawing density core-sets under different requirements, and prove that they are all NP-complete whether or not functional dependencies are given. (2) We propose an efficient approximate algorithm for drawing an approximate density core-set, which employs an approximate Knapsack algorithm and weighted sampling techniques to select important candidate tuples. (3) Analysis of the proposed algorithm shows that, with high probability, the relative error between the completeness of the approximate density core-set and that of a density core-set of the same size is within a given relative error bound. (4) Experiments on both real-world and synthetic datasets demonstrate the effectiveness and efficiency of the algorithm.
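The idea of a subset whose completeness tracks that of the whole dataset can be illustrated with a minimal sketch. The abstract does not give the paper's formal definition of completeness, so the code below assumes a simplified one: the fraction of cells that are not missing, with `None` marking a missing value. The names `completeness`, `relative_error`, and `within_bound` are hypothetical, and this is not the authors' algorithm; it only checks whether a given candidate subset meets a relative error bound.

```python
from typing import Optional, Sequence

# A row is a tuple of attribute values; None marks a missing value (assumption).
Row = Sequence[Optional[object]]


def completeness(rows: Sequence[Row]) -> float:
    """Fraction of non-missing cells: an assumed, simplified completeness measure."""
    total = sum(len(r) for r in rows)
    if total == 0:
        return 1.0  # an empty relation has no missing cells
    present = sum(1 for r in rows for v in r if v is not None)
    return present / total


def relative_error(subset: Sequence[Row], full: Sequence[Row]) -> float:
    """Relative error between the completeness of a subset and of the full data."""
    c_full = completeness(full)
    c_sub = completeness(subset)
    if c_full == 0:
        return 0.0 if c_sub == 0 else float("inf")
    return abs(c_sub - c_full) / c_full


def within_bound(subset: Sequence[Row], full: Sequence[Row], eps: float) -> bool:
    """True if the subset's completeness is within relative error eps."""
    return relative_error(subset, full) <= eps


# Toy relation with 12 cells, 4 of them missing.
data = [
    ("alice", 30, "NYC"),
    ("bob", None, "LA"),
    ("carol", 41, None),
    ("dave", None, None),
]
# Candidate subset with 6 cells, 2 of them missing.
candidate = [data[1], data[2]]

print(completeness(data), completeness(candidate))
print(within_bound(candidate, data, eps=0.05))
```

Under this assumed measure, the two-tuple candidate has the same ratio of present cells as the full relation, so it passes the bound. A subset made only of the fully populated tuple would not.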

Bibliographic details

Source: International Conference on Database Systems for Advanced Applications, 2017; xxiii, 684 p.; 16 pages in total
Authors: Yongnan Liu; Jianzhong Li; Hong Gao
Original format: PDF
Chinese Library Classification: TP311.13
Keywords: Data quality; Density core-sets; Incomplete data; Query completeness estimation
