Engineering Applications of Artificial Intelligence

Proposing a classifier ensemble framework based on classifier selection and decision tree


Abstract

One of the most important tasks in pattern recognition, machine learning, and data mining is the classification problem. Introducing a general classifier that can learn any problem's dataset is a challenge for the pattern recognition community. Many classifiers have been proposed so far, but each has its own strengths and weaknesses, so each works well only for specific problems, and there is no reliable way to tell which classifier is best suited to a given problem. Fortunately, ensemble learning offers a way to obtain a near-optimal classifying system for any problem. One of the most challenging problems in building a classifier ensemble is choosing a suitable set of base classifiers. Every ensemble needs diversity: if a group of classifiers is to form a successful ensemble, its members must be diverse enough to cover one another's errors. Therefore, during ensemble creation, a mechanism is needed to ensure that the ensemble classifiers are diverse. Such a mechanism may select or remove a subset of base classifiers so as to maintain the diversity of the ensemble. This paper proposes a novel method for ensemble creation, named Classifier Selection Based on Clustering (CSBC). To ensure diversity among the ensemble classifiers, the method clusters the classifiers. Bagging is used to produce the base classifiers, which are all of the same type, either decision tree classifiers or multilayer perceptron classifiers. After producing a number of base classifiers, CSBC partitions them with a clustering algorithm and then builds the final ensemble by selecting one classifier from each cluster. A weighted majority vote is used as the aggregation function. In this paper we investigate the influence of the number of clusters on the performance of the CSBC method, and we examine how a good approximate value for the number of clusters can be chosen for any dataset. The study is based on a large number of real datasets from the UCI repository in order to reach a definite result.
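The abstract only sketches the procedure, so the following is a minimal Python sketch of the idea, not the paper's exact algorithm: bagging produces decision-tree base classifiers, the classifiers are clustered by their prediction vectors, one classifier is kept per cluster, and a weighted majority vote aggregates them. The dataset, the clustering of prediction vectors with k-means, and the training-accuracy vote weights are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 1) Bagging: train decision-tree base classifiers on bootstrap samples.
n_base, n_clusters = 25, 5
base = []
for i in range(n_base):
    Xb, yb = resample(X_tr, y_tr, random_state=i)
    base.append(DecisionTreeClassifier(random_state=i).fit(Xb, yb))

# 2) Cluster the classifiers by their prediction vectors on the training set,
#    so classifiers that behave (and err) alike fall into the same cluster.
pred_matrix = np.array([clf.predict(X_tr) for clf in base])
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(pred_matrix)

# 3) Keep one classifier per cluster (here: the most accurate member) so the
#    selected classifiers stay diverse.
ensemble, weights = [], []
for c in range(n_clusters):
    members = [i for i in range(n_base) if labels[i] == c]
    if not members:
        continue
    best = max(members, key=lambda i: base[i].score(X_tr, y_tr))
    ensemble.append(base[best])
    weights.append(base[best].score(X_tr, y_tr))  # assumed weight: training accuracy

# 4) Weighted majority vote of the selected classifiers.
votes = np.zeros((len(X_te), 2))
for clf, w in zip(ensemble, weights):
    votes[np.arange(len(X_te)), clf.predict(X_te)] += w
print("CSBC-style ensemble accuracy:", (votes.argmax(axis=1) == y_te).mean())

In this sketch the number of clusters directly controls the final ensemble size, which is the parameter whose influence the paper studies.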
