
Selecting the Right Interestingness Measure for Association Patterns


Abstract

Many techniques for association rule mining and feature selection require a suitable metric to capture the dependencies among variables in a data set. For example, metrics such as support, confidence, lift, correlation, and collective strength are often used to determine the interestingness of association patterns. However, many such measures provide conflicting information about the interestingness of a pattern, and the best metric to use for a given application domain is rarely known. In this paper, we present an overview of various measures proposed in the statistics, machine learning, and data mining literature. We describe several key properties one should examine in order to select the right measure for a given application domain. A comparative study of these properties is made using twenty-one of the existing measures. We show that each measure has different properties that make it useful for some application domains but not for others. We also present two scenarios in which most of the existing measures agree with each other, namely, support-based pruning and table standardization. Finally, we present an algorithm to select a small set of tables such that an expert can select a desirable measure by looking at just this small set of tables.
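To make the point about conflicting measures concrete, the following minimal Python sketch computes four of the measures named above (support, confidence, lift, and the phi correlation coefficient) from a 2x2 contingency table for a rule A -> B. The function name `measures`, the argument names, and the two sets of counts are illustrative assumptions, not taken from the paper; the formulas are the standard textbook definitions.

```python
from math import sqrt


def measures(f11, f10, f01, f00):
    """Compute support, confidence, lift and phi for the rule A -> B.

    f11: rows containing both A and B
    f10: rows containing A but not B
    f01: rows containing B but not A
    f00: rows containing neither
    (Hypothetical helper; names are chosen for illustration only.)
    """
    n = f11 + f10 + f01 + f00
    p_ab = f11 / n              # joint probability P(A, B)
    p_a = (f11 + f10) / n       # marginal P(A)
    p_b = (f11 + f01) / n       # marginal P(B)
    return {
        "support": p_ab,
        "confidence": p_ab / p_a,
        "lift": p_ab / (p_a * p_b),
        "phi": (p_ab - p_a * p_b)
        / sqrt(p_a * (1 - p_a) * p_b * (1 - p_b)),
    }


# Two made-up tables of 1000 rows each. The second is the first with
# presence and absence swapped for both items.
m1 = measures(880, 50, 50, 20)
m2 = measures(20, 50, 50, 880)

for name in ("support", "confidence", "lift", "phi"):
    print(f"{name:>10}: table 1 = {m1[name]:.4f}, table 2 = {m2[name]:.4f}")
```

With these illustrative counts, support and confidence rank the first table higher, lift ranks the second table higher, and phi gives both tables essentially the same score, so the ordering of the two patterns depends on which measure is chosen.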
