Australasian Journal of Information Systems

Comparing sets of patterns with the Jaccard index

Abstract

The ability to extract knowledge from data has been the driving force of Data Mining since its inception, and of statistical modeling long before that. Actionable knowledge often takes the form of patterns, where a set of antecedents can be used to infer a consequent. In this paper we offer a solution to the problem of comparing different sets of patterns. Our solution allows comparisons between sets of patterns that were derived from different techniques (such as different classification algorithms), or from different samples of data (such as temporal data or data perturbed for privacy reasons). We propose using the Jaccard index to measure the similarity between sets of patterns by converting each pattern into a single element within the set. Our measure focuses on providing conceptual simplicity, computational simplicity, interpretability, and wide applicability. The results of this measure are compared to prediction accuracy in the context of a real-world data mining scenario.
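
The idea of treating each pattern as a single set element can be illustrated with a minimal Python sketch. The abstract does not say how the paper encodes a pattern, so the encoding used here, a pair of (frozen set of antecedents, consequent), is an assumption, as are the names `make_pattern` and `jaccard_index` and the toy patterns. Returning 1.0 for two empty sets is also a convention chosen for this sketch, not something taken from the paper.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Assumed encoding: a pattern is (antecedents, consequent).
# A frozenset makes the antecedents order-insensitive and hashable,
# so the whole pattern can be stored as one element of a set.
Pattern = Tuple[FrozenSet[str], str]


def make_pattern(antecedents: Iterable[str], consequent: str) -> Pattern:
    """Build a hashable pattern from its antecedents and consequent."""
    return (frozenset(antecedents), consequent)


def jaccard_index(a: Set[Pattern], b: Set[Pattern]) -> float:
    """Return |a ∩ b| / |a ∪ b| for two sets of patterns."""
    union = a | b
    if not union:
        # Convention assumed here: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical pattern sets, e.g. from two different classifiers.
patterns_a = {
    make_pattern(["age>30", "income=high"], "buys=yes"),
    make_pattern(["age<=30"], "buys=no"),
    make_pattern(["student=yes"], "buys=yes"),
}
patterns_b = {
    make_pattern(["income=high", "age>30"], "buys=yes"),  # same pattern, reordered
    make_pattern(["age<=30"], "buys=no"),
    make_pattern(["student=no"], "buys=no"),
}

similarity = jaccard_index(patterns_a, patterns_b)
print(similarity)
```

With this encoding, two patterns match only if their antecedents and consequent are exactly equal. A looser notion of pattern equivalence would need a different conversion step before the sets are compared.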