Expert Systems with Applications

OOIMASP: Origin based association rule mining with order independent mostly associated sequential patterns

Abstract

Efficient mining of association rules on a transaction dataset is an interesting and challenging problem. The state-of-the-art MASP algorithm depends on the order of items in each transaction, which is one of its drawbacks. We propose the OOIMASP algorithm, which has two novel properties: (1) order independence and (2) use of the origin of items to calculate unbiased support and unbiased confidence values. OOIMASP achieves order independence by rearranging the items in transactions using a greedy frequency-based approach. We compare the performance of our system with MASP on five synthetic data sets and three public data sets. The results show that the proposed approach outperforms MASP on both comparison metrics, i.e., the number of association rules generated and the length of the longest association rule. Both metrics are important for evaluating the performance of an algorithm. On average, OOIMASP generates 632% longer rules and 457% more association rules than MASP. The disadvantage of the proposed algorithm is that it requires more computation time, approximately 5 times more than MASP. We claim that the extra information extracted using our method compensates for the increase in time complexity compared to MASP. The proposed method also produces multiple trees, which can be very useful in the visual analysis of data. (C) 2017 Elsevier Ltd. All rights reserved.
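The abstract does not spell out the rearrangement step or the origin-based "unbiased" measures, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "greedy frequency-based" reordering means sorting each transaction's items by descending global frequency (ties broken alphabetically, an arbitrary choice made here), and it uses the standard textbook definitions of support and confidence rather than the paper's unbiased variants. All function names are hypothetical.

```python
from collections import Counter


def reorder_by_frequency(transactions):
    """Sort items in each transaction by descending global frequency.

    Assumption: frequency is the number of transactions containing the item,
    and ties are broken by the item's string form. Duplicate items within a
    transaction are collapsed.
    """
    counts = Counter(item for t in transactions for item in set(t))
    return [
        sorted(set(t), key=lambda item: (-counts[item], str(item)))
        for t in transactions
    ]


def support(itemset, transactions):
    """Standard support: fraction of transactions containing every item."""
    if not transactions:
        return 0.0
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Standard confidence of the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


if __name__ == "__main__":
    # Toy data invented for illustration only.
    tx = [["b", "a", "c"], ["c", "a"], ["a", "d"]]
    print(reorder_by_frequency(tx))
    print(support({"a", "c"}, tx))
    print(confidence({"c"}, {"a"}, tx))
```

Because every transaction is sorted by the same global key, two transactions containing the same items end up in the same order regardless of how they were originally written, which is the order-independence property the abstract describes.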
