Journal: Engineering Applications of Artificial Intelligence

Discovery of hidden correlations in a local transaction database based on differences of correlations


Abstract

Given a transaction database, treated as a global set of transactions, and a local database obtained from it by some conditioning, we consider pairs of itemsets whose degree of correlation is higher in the local database than in the global one. The problem of finding paired itemsets that are highly correlated within a single database is known as correlation discovery, and it has been studied because highly correlated itemsets are characteristic of that database. However, even noncharacteristic paired itemsets are meaningful, provided that their degree of correlation increases significantly in the local database compared with the global one. Such pairs can serve as implicit, hidden evidence that something particular to the local database is occurring, even though they were not previously recognized as characteristic. From this viewpoint, we have proposed measuring the significance of paired itemsets by the difference between the two correlations before and after conditioning the global database, and have defined the notion of DC pairs, which are pairs whose difference of correlation is high. In this paper, we develop an algorithm for mining DC pairs and apply it to a transaction database with time stamp data. Finding DC pairs in large databases is computationally hard in general, because the algorithm must check even noncharacteristic paired itemsets. Nevertheless, we show that our algorithm, equipped with some pruning rules, successfully finds DC pairs that may be significant.
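
To make the idea of a difference of correlations concrete, the following is a minimal brute-force sketch, not the algorithm developed in the paper. The abstract does not name the correlation measure or the pruning rules, so this sketch assumes a lift-style measure, restricts itself to pairs of single items, and applies no pruning. The function names (`support`, `correlation`, `dc_pairs`), the threshold `min_diff`, and the toy data are all hypothetical. The local database is obtained here by conditioning on the time stamp (transactions at or after 18:00).

```python
from itertools import combinations


def support(db, itemset):
    """Fraction of transactions in db that contain every item of itemset."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def correlation(db, x, y):
    """Assumed measure (lift): P(X and Y) / (P(X) * P(Y)); 0.0 if undefined."""
    px, py = support(db, x), support(db, y)
    if px == 0 or py == 0:
        return 0.0
    return support(db, x | y) / (px * py)


def dc_pairs(global_db, local_db, min_diff):
    """Return (a, b, diff) for item pairs whose correlation rises by at
    least min_diff from the global to the local database (no pruning)."""
    items = sorted(set().union(*global_db))
    found = []
    for a, b in combinations(items, 2):
        x, y = frozenset([a]), frozenset([b])
        diff = correlation(local_db, x, y) - correlation(global_db, x, y)
        if diff >= min_diff:
            found.append((a, b, diff))
    # Largest increase first.
    return sorted(found, key=lambda r: -r[2])


# Toy time-stamped transactions: (hour of day, items bought).
stamped = [
    (9, {"beer", "bread"}),
    (10, {"chips", "milk"}),
    (11, {"beer", "milk"}),
    (12, {"chips", "bread"}),
    (19, {"beer", "chips"}),
    (20, {"beer", "chips"}),
    (21, {"bread"}),
    (22, {"milk"}),
]

global_db = [frozenset(items) for _, items in stamped]
# Conditioning: keep only evening transactions.
local_db = [frozenset(items) for hour, items in stamped if hour >= 18]

pairs = dc_pairs(global_db, local_db, min_diff=0.5)
print(pairs)
```

In this toy data, beer and chips are uncorrelated globally under the assumed measure (lift 1.0) but always co-occur in the evening subset (lift 2.0), so the pair is reported even though it would not stand out in the global database. Enumerating every pair this way grows quadratically in the number of items, and worse for general itemsets, which is the cost the abstract says pruning rules are meant to contain.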
