Journal of Intelligent Information Systems

Holistic primary key and foreign key detection

Abstract

Primary keys (PKs) and foreign keys (FKs) are important elements of relational schemata in various applications, such as query optimization and data integration. However, in many cases, these constraints are unknown or not documented. Detecting them manually is time-consuming and even infeasible in large-scale datasets. We study the problem of discovering primary keys and foreign keys automatically and propose an algorithm to detect both, namely Holistic Primary Key and Foreign Key Detection (HoPF). PKs and FKs are subsets of the sets of unique column combinations (UCCs) and inclusion dependencies (INDs), respectively, for which efficient discovery algorithms are known. Using score functions, our approach is able to effectively extract the true PKs and FKs from the vast sets of valid UCCs and INDs. Several pruning rules are employed to speed up the procedure. We evaluate precision and recall on three benchmarks and two real-world datasets. The results show that our method is able to retrieve on average 88% of all primary keys, and 91% of all foreign keys. We compare the performance of HoPF with two baseline approaches that both assume the existence of primary keys.
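
The pipeline described in the abstract (discover UCCs and INDs, then score them to pick out likely PKs and FKs) can be illustrated on toy data. The sketch below is not the HoPF algorithm: the abstract does not spell out its score functions or pruning rules, so the two scoring heuristics here (prefer short, leftmost UCCs; prefer INDs whose dependent column name contains the referenced column name) are assumptions chosen only for illustration. The brute-force discovery routines, the table contents, and all function names are likewise hypothetical.

```python
from itertools import combinations


def unique_column_combinations(rows, columns, max_size=2):
    """Brute-force search for minimal column sets with no duplicate value tuples."""
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            # A superset of a known UCC is unique too, but not minimal.
            if any(set(u) <= set(combo) for u in found):
                continue
            values = [tuple(row[c] for c in combo) for row in rows]
            if len(set(values)) == len(values):
                found.append(combo)
    return found


def unary_inclusion_dependencies(tables):
    """Return (dep_table, dep_col, ref_table, ref_col) where dep values are a subset of ref values."""
    column_values = {
        (name, col): {row[col] for row in rows}
        for name, rows in tables.items()
        for col in rows[0]
    }
    inds = []
    for dep, dep_vals in column_values.items():
        for ref, ref_vals in column_values.items():
            if dep != ref and dep_vals <= ref_vals:
                inds.append(dep + ref)
    return inds


def pick_primary_key(uccs, columns):
    # Toy score (assumption): fewer columns first, then leftmost position.
    return min(uccs, key=lambda u: (len(u), min(columns.index(c) for c in u)))


def rank_foreign_keys(inds, primary_keys):
    # Keep only INDs that reference a chosen single-column primary key.
    candidates = [ind for ind in inds if primary_keys.get(ind[2]) == (ind[3],)]
    # Toy score (assumption): dependent column name contains the referenced name.
    return sorted(candidates, key=lambda ind: ind[3] in ind[1], reverse=True)


# Hypothetical example data.
tables = {
    "customers": [
        {"id": 1, "name": "Ann"},
        {"id": 2, "name": "Bob"},
        {"id": 3, "name": "Ann"},
    ],
    "orders": [
        {"order_id": 10, "customer_id": 1, "quantity": 1},
        {"order_id": 11, "customer_id": 1, "quantity": 2},
        {"order_id": 12, "customer_id": 3, "quantity": 1},
    ],
}
columns = {name: list(rows[0]) for name, rows in tables.items()}

customer_uccs = unique_column_combinations(tables["customers"], columns["customers"])
order_uccs = unique_column_combinations(tables["orders"], columns["orders"])
primary_keys = {
    "customers": pick_primary_key(customer_uccs, columns["customers"]),
    "orders": pick_primary_key(order_uccs, columns["orders"]),
}
inds = unary_inclusion_dependencies(tables)
ranked = rank_foreign_keys(inds, primary_keys)

print(primary_keys)
print(ranked)
```

Even on six rows the data yields a spurious UCC, `(customer_id, quantity)`, and a spurious IND, `orders.quantity ⊆ customers.id`. This is the situation the abstract refers to when it says the true PKs and FKs must be extracted from much larger sets of valid UCCs and INDs.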