An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations

Abstract

Given a transaction database, regarded as a global set of transactions, and a local database obtained from it by some conditioning, we consider pairs of itemsets whose degree of correlation is higher in the local database than in the global one. The problem of finding itemset pairs with high correlation in a single database is known as Discovery of Correlation, and several algorithms that search for such characteristic itemset pairs have already been proposed. However, even itemset pairs that are not characteristic in the local database are meaningful, provided that their degree of correlation is much higher in the local database than in the global one. They can be implicit, hidden evidence that something particular to the local database is occurring, even though they are not yet recognized as characteristic in the local database. From this viewpoint, we have previously proposed measuring the significance of an itemset pair by the difference between its two correlations before and after conditioning to the local database, and defined the notion of DC pairs, whose differences of correlations are high. Since a DC pair can be regarded as a compound itemset consisting of two component itemsets, there are two basic strategies for finding them. One strategy examines the compound itemsets first and then the components, while the other examines the component itemsets first and then the compound ones. Under the former strategy, which we have already proposed and tested for effectiveness, we have to enumerate a large number of candidate compound itemsets that cannot be decomposed into components. For this reason, this paper presents a new algorithm that follows the second strategy. First, it enumerates possible component itemsets using a new pruning rule that cuts off useless components. Second, it forms compound itemsets by combining the components thus detected, while using a constraint that prevents the algorithm from checking meaningless combinations.
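The component-first strategy described in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm: the abstract does not state the correlation measure, the pruning rule, or the combination constraint, so the sketch assumes lift, P(X ∪ Y) / (P(X) P(Y)), as the correlation, plain subtraction as the difference, a local-support threshold as a stand-in for the pruning rule, and disjointness of components as a stand-in for the constraint. The names `support`, `correlation`, and `dc_pairs` are hypothetical.

```python
from itertools import combinations


def support(db, itemset):
    """Fraction of transactions in db that contain every item of itemset."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def correlation(db, x, y):
    """Assumed measure (lift): P(x and y) / (P(x) * P(y)); 0.0 if undefined."""
    px, py = support(db, x), support(db, y)
    if px == 0 or py == 0:
        return 0.0
    return support(db, x | y) / (px * py)


def dc_pairs(global_db, condition, min_diff, min_support=0.0, max_size=1):
    """Return (x, y, diff) for pairs whose correlation rises by >= min_diff."""
    # Assumed conditioning: the local database keeps the transactions
    # that contain every item of `condition`.
    local_db = [t for t in global_db if condition <= t]
    items = sorted(set().union(*global_db) - condition)

    # Step 1: enumerate component itemsets. The local-support filter is
    # only a placeholder for the paper's pruning rule.
    components = [
        frozenset(c)
        for k in range(1, max_size + 1)
        for c in combinations(items, k)
        if support(local_db, frozenset(c)) > min_support
    ]

    # Step 2: combine components into compound itemsets. Skipping
    # overlapping components is a placeholder for the paper's constraint.
    result = []
    for x, y in combinations(components, 2):
        if x & y:
            continue
        diff = correlation(local_db, x, y) - correlation(global_db, x, y)
        if diff >= min_diff:
            result.append((x, y, diff))
    return result


# Toy data: "a" and "b" are uncorrelated globally but co-occur
# whenever "c" is present.
transactions = [
    frozenset(t)
    for t in ("abc", "abc", "ad", "bd", "a", "b", "c", "d")
]
for x, y, diff in dc_pairs(transactions, frozenset("c"), min_diff=0.3):
    print(sorted(x), sorted(y), round(diff, 3))
```

In the toy data, the pair ({a}, {b}) has no notable correlation globally, yet the two items always appear together in the transactions containing c, which is the kind of implicit pair the abstract targets. Enumerating components first keeps the candidate set small because any component filtered out in step 1 never appears in a combination in step 2.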

Bibliographic information

Source: International Conference on Discovery Science, 2005, 14 pages
Authors: Tsuyoshi Taniguchi; Makoto Haraguchi
Original format: PDF
CLC classification: Artificial intelligence theory
