International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM 2005); July 9-11, 2005; Leipzig (DE)

Discovery of Hidden Correlations in a Local Transaction Database Based on Differences of Correlations


Abstract

Given a transaction database as a global set of transactions and a sub-database of it regarded as a local one, we consider pairs of itemsets whose degrees of correlation are higher in the local database than in the global one. If such a pair shows high correlation in the local database, it can be detected by search methods from previous studies. On the other hand, there exists another kind of paired itemsets that are not regarded as characteristic and cannot be found by those methods, but whose degrees of correlation become drastically higher when conditioned on the local database. We focus on this latter kind, since such pairs can be implicit, hidden evidence that something particular to the local database is occurring, even though they are not yet recognized as characteristic. From this viewpoint, we measure paired itemsets by the difference between two correlations, before and after conditioning on the local database, and define the notion of DC pairs, whose degrees of difference of correlations are high. Because the measure is non-monotonic, we present an algorithm that searches for DC pairs using new pruning rules for cutting off hopeless itemsets. An experimental result shows that potentially significant DC pairs can actually be found in a given database and that the algorithm successfully detects them.
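The idea of comparing a pair's correlation before and after conditioning on the local database can be illustrated with a small sketch. The abstract does not state which correlation measure or which form of difference the paper uses, so the code below makes two assumptions: correlation is taken to be lift, P(X and Y) / (P(X) P(Y)), and the change is a plain subtraction of the global value from the local one. The function names and the toy data are hypothetical, and the paper's pruning rules are not reproduced.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def support(db: Sequence[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions in db that contain every item of itemset."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def lift(db: Sequence[Transaction], x: FrozenSet[str], y: FrozenSet[str]) -> float:
    """Assumed correlation measure: P(X and Y) / (P(X) * P(Y))."""
    px, py = support(db, x), support(db, y)
    if px == 0.0 or py == 0.0:
        return 0.0  # undefined when either itemset never occurs; treated as 0 here
    return support(db, x | y) / (px * py)


def correlation_change(
    global_db: Sequence[Transaction],
    local_db: Sequence[Transaction],
    x: FrozenSet[str],
    y: FrozenSet[str],
) -> float:
    """Assumed difference of correlations: local lift minus global lift."""
    return lift(local_db, x, y) - lift(global_db, x, y)


# Hypothetical toy data: the local database is the sub-database of
# transactions that contain item "c".
GLOBAL_DB = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a"}),
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"b"}),
    frozenset({"c"}),
    frozenset({"c"}),
]
LOCAL_DB = [t for t in GLOBAL_DB if "c" in t]

X, Y = frozenset({"a"}), frozenset({"b"})
print(lift(GLOBAL_DB, X, Y))                          # 1.0 (independent globally)
print(lift(LOCAL_DB, X, Y))                           # 2.0 (correlated locally)
print(correlation_change(GLOBAL_DB, LOCAL_DB, X, Y))  # 1.0
```

In this toy data the pair ({a}, {b}) looks uncorrelated in the global database, yet its lift doubles once attention is restricted to the local database. That jump is the kind of change the abstract describes for DC pairs, though a threshold for "high" would still have to be chosen.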
