
Efficiently Mining Interesting Emerging Patterns


Abstract

Emerging patterns (EPs) are itemsets whose supports change significantly from one class to another. They have been shown to be powerful discriminating features and are very useful for constructing accurate classifiers. Previous EP mining approaches often produce a large number of EPs, which makes it very difficult to choose the interesting ones manually. Usually, a post-processing filter step is applied to select interesting EPs according to some interestingness measures. In this paper, we first generalize the interestingness measures for EPs, including the minimum support, the minimum growth rate, the subset relationship between EPs, and the correlation based on common statistical measures such as the chi-squared value. We then develop an efficient algorithm for mining only those interesting EPs, in which the chi-squared test is used as a heuristic to prune the search space. The experimental results show that our algorithm remains efficient even at low supports on data that is large, dense, and high-dimensional. They also show that the heuristic is admissible, because only unimportant EPs with low supports are ignored. Our work on EP-based classification confirms that the discovered interesting EPs are excellent candidates for building accurate classifiers.
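
The interestingness measures named in the abstract can be illustrated with a small sketch. It is not the paper's mining algorithm: it only evaluates one given itemset against two classes, and it assumes the standard definitions of support (fraction of transactions containing the itemset) and growth rate (ratio of the two class supports). The chi-squared value is computed here on a 2x2 table of itemset presence versus class, which is an assumption about how the measure might be applied. All function names, parameter names, and default thresholds are hypothetical.

```python
import math

def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain `itemset` (0.0 for empty data)."""
    return count(itemset, transactions) / len(transactions) if transactions else 0.0

def growth_rate(itemset, background, target):
    """Support ratio from the background class to the target class.

    Conventions: 0 when the target support is 0, infinity when the
    itemset occurs in the target class but never in the background.
    """
    s_bg, s_tg = support(itemset, background), support(itemset, target)
    if s_tg == 0:
        return 0.0
    return math.inf if s_bg == 0 else s_tg / s_bg

def chi_squared(itemset, background, target):
    """Pearson chi-squared statistic for the 2x2 table presence x class."""
    a = count(itemset, target)        # present, target class
    b = count(itemset, background)    # present, background class
    c = len(target) - a               # absent, target class
    d = len(background) - b           # absent, background class
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                    # a row or column is empty: no evidence
        return 0.0
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom

def passes_thresholds(itemset, background, target,
                      min_support=0.1, min_growth=2.0, min_chi2=3.84):
    """Hypothetical filter combining the three measures.

    The defaults are illustrative; 3.84 is roughly the 5% critical value
    of the chi-squared distribution with one degree of freedom.
    """
    return (support(itemset, target) >= min_support
            and growth_rate(itemset, background, target) >= min_growth
            and chi_squared(itemset, background, target) >= min_chi2)

# Toy data (made up for illustration): each transaction is a set of items.
target = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
background = [frozenset("a"), frozenset("c"), frozenset("bc"), frozenset("ab")]
pattern = frozenset("ab")

print(support(pattern, target))                  # support in the target class
print(growth_rate(pattern, background, target))  # background -> target ratio
print(chi_squared(pattern, background, target))  # association with the class
```

On data this small the chi-squared value stays well below any sensible threshold, so `passes_thresholds` rejects the pattern even though its growth rate is 2. The sketch also applies the measures as a post-processing filter, whereas the abstract describes using the chi-squared test during the search to prune candidates early.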
