On Support Thresholds in Associative Classification

Abstract

Associative classification is a well-known technique for structured data classification. Most previous work on associative classification uses support-based pruning for rule extraction and usually sets the threshold to 1%. This threshold keeps rule extraction tractable and, on average, yields good accuracy. We believe that this threshold may not be accurate in some cases, since the class distribution in the dataset is not taken into account. In this paper we investigate the effect of the support threshold on classification accuracy. Lower support thresholds are often infeasible with current extraction algorithms, or may cause the generation of a huge rule set. To observe the effect of varying the support threshold, we first propose a compact form to encode a complete rule set. We then develop a new classifier, named L^3_G, based on the compact form. Taking advantage of the compact form, the classifier can also be built with rules of rather low support. We ran a variety of experiments with different support thresholds on datasets from the UCI machine learning database repository. The experiments showed that the optimal accuracy is obtained for variable threshold values, sometimes lower than 1%.
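
The abstract's concern, that a fixed 1% support threshold ignores class distribution, can be illustrated with a minimal sketch. This is not the paper's classifier or its compact rule-set form; it is a brute-force toy under stated assumptions. The function names (`rule_supports`, `prune_by_support`), the item and class labels, and the dataset itself are hypothetical, and support is taken here as the fraction of all records that contain the rule's itemset and carry its class label.

```python
from collections import Counter
from itertools import combinations


def rule_supports(records, max_len=2):
    """Return the support of every rule (itemset -> class) with up to max_len items.

    Support is the fraction of all records that contain the itemset
    and are labeled with the class (an assumed, common definition).
    """
    n = len(records)
    counts = Counter()
    for items, label in records:
        for k in range(1, max_len + 1):
            for itemset in combinations(sorted(items), k):
                counts[(itemset, label)] += 1
    return {rule: c / n for rule, c in counts.items()}


def prune_by_support(supports, min_support):
    """Keep only the rules whose support reaches the threshold."""
    return {rule: s for rule, s in supports.items() if s >= min_support}


# Hypothetical toy data: 1000 records, class "rare" covers only 0.5% of them.
records = [({"a", "b"}, "common")] * 995 + [({"c", "d"}, "rare")] * 5

supports = rule_supports(records)
at_1pct = prune_by_support(supports, 0.01)    # the customary 1% threshold
at_01pct = prune_by_support(supports, 0.001)  # a lower 0.1% threshold

classes_at_1pct = {label for (_, label) in at_1pct}
classes_at_01pct = {label for (_, label) in at_01pct}
print(classes_at_1pct)
print(classes_at_01pct)
```

In this toy setting no rule for the minority class can reach 1% support, because the class itself covers less than 1% of the records, so rules predicting it survive only when the threshold is lowered. The exhaustive enumeration used here also hints at why low thresholds become expensive on realistic data.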
