Published in: SIGKDD Explorations
Efficient Mining of the Most Significant Patterns with Permutation Testing

Abstract

The extraction of patterns displaying significant association with a class label is a key data mining task with wide application in many domains. We study a variant of the problem that requires mining the top-k statistically significant patterns, thus providing tight control on the number of patterns reported in output. We develop TopKWY, the first algorithm to mine the top-k significant patterns while rigorously controlling the family-wise error rate of the output, and provide theoretical evidence of its effectiveness. TopKWY crucially relies on a novel strategy to explore statistically significant patterns and on several key implementation choices, which may be of independent interest. Our extensive experimental evaluation shows that TopKWY enables the extraction of the most significant patterns from large datasets that could not be analyzed by the state of the art. In addition, TopKWY improves over the state of the art even for the extraction of all significant patterns.
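
To make the task in the abstract concrete, the following is a minimal sketch of permutation-based family-wise error rate control for a small, explicitly listed set of patterns. It is not TopKWY itself: the abstract does not describe the algorithm's search strategy, so this sketch substitutes a brute-force Westfall-Young style min-p procedure with Fisher's exact test. The function names (`fisher_p`, `top_k_significant`), the representation of a pattern as a set of transaction indices, and the adjusted p-value convention (fraction of permutations whose minimum p-value is at most the observed one) are all assumptions made for illustration.

```python
import random
from math import comb


def fisher_p(a, n1, x, n):
    """Two-sided Fisher exact p-value (hypothetical helper).

    n  : number of transactions
    n1 : number of transactions with class label 1
    x  : support of the pattern (transactions containing it)
    a  : supporting transactions that have label 1
    """
    lo, hi = max(0, x - (n - n1)), min(x, n1)
    denom = comb(n, x)
    # Hypergeometric probability of every feasible table with these margins.
    probs = [comb(n1, j) * comb(n - n1, x - j) / denom for j in range(lo, hi + 1)]
    p_obs = probs[a - lo]
    # Sum the tables that are at least as extreme as the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def top_k_significant(supports, labels, k, alpha=0.05, n_perm=1000, seed=0):
    """Return up to k patterns as (index, p_value, adjusted_p), smallest p first.

    supports : list of sets of transaction indices, one set per pattern
    labels   : list of 0/1 class labels, one per transaction
    """
    rng = random.Random(seed)
    n, n1 = len(labels), sum(labels)

    def p_values(lab):
        return [fisher_p(sum(lab[i] for i in s), n1, len(s), n) for s in supports]

    observed = p_values(labels)

    # Null distribution of the minimum p-value under random label permutations.
    perm = list(labels)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(perm)
        min_ps.append(min(p_values(perm)))

    results = []
    for idx, p in enumerate(observed):
        # Adjusted p-value: share of permutations with a min p-value <= p.
        adj = sum(m <= p for m in min_ps) / n_perm
        if adj <= alpha:
            results.append((idx, p, adj))
    results.sort(key=lambda r: r[1])
    return results[:k]


if __name__ == "__main__":
    labels = [1] * 20 + [0] * 20
    supports = [set(range(20)), set(range(0, 40, 2))]
    print(top_k_significant(supports, labels, k=1, n_perm=200, seed=1))
```

In this toy input the first pattern occurs exactly in the label-1 transactions, while the second is split evenly across both classes, so only the first can pass the adjusted threshold. The sketch recomputes every p-value for every permutation, which is only feasible for a handful of patterns; avoiding that cost on large datasets is the kind of problem the abstract says TopKWY addresses.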