Journal: International Journal of Hybrid Intelligent Systems

On diversity and accuracy of homogeneous and heterogeneous ensembles



Abstract

The ensemble learning approach has been increasingly used in data mining to improve performance. However, the gain in learning performance appears to vary considerably from application to application. In some cases little or no gain was achieved even when the same ensemble paradigms were used. This means that some basic issues in ensemble methodology are still not well understood, especially the factors that can affect the performance of an ensemble and the strategies for constructing effective ensembles. This paper attempts to address these issues. It first describes the possible influencing factors and then focuses on investigating the most important one, diversity, and its relationship with the accuracy of an ensemble. In this study, two types of ensembles, homogeneous and heterogeneous, are defined and constructed using ten different learning algorithms, and their diversity and accuracy are evaluated in order to find out which types of ensemble possess high diversity and are thus more accurate. For each of the ten learning algorithms, its ability to generate different types of diversity is estimated quantitatively using ten common diversity measures, and their characteristics are then analyzed to establish their correlation with ensemble performance. The study used fifteen popular data sets to verify the consistency and reliability of the experimental findings.
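The abstract does not name the ten diversity measures it uses. As an illustration only, the sketch below computes two pairwise measures that are widely used in the ensemble literature, the disagreement measure and Yule's Q statistic, and averages them over all pairs of ensemble members. It assumes each classifier is represented by its "oracle output" (one boolean per test sample, true if the classifier was correct); the function names and the zero-denominator convention for Q are hypothetical choices made for this sketch, not the paper's.

```python
from itertools import combinations


def disagreement(correct_a, correct_b):
    """Fraction of samples on which exactly one of the two classifiers is correct."""
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("oracle outputs must be non-empty and of equal length")
    differing = sum(bool(a) != bool(b) for a, b in zip(correct_a, correct_b))
    return differing / len(correct_a)


def q_statistic(correct_a, correct_b):
    """Yule's Q: +1 when the classifiers err on the same samples, -1 when they err on different ones."""
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("oracle outputs must be non-empty and of equal length")
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1  # both correct
        elif not a and not b:
            n00 += 1  # both wrong
        elif a:
            n10 += 1  # only the first classifier is correct
        else:
            n01 += 1  # only the second classifier is correct
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        return 0.0  # assumption of this sketch: treat the undefined case as "no association"
    return (n11 * n00 - n01 * n10) / denominator


def mean_pairwise(measure, oracle_outputs):
    """Average a pairwise measure over every pair of ensemble members."""
    pairs = list(combinations(oracle_outputs, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(measure(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up oracle outputs for three classifiers on six samples (illustrative, not from the paper).
    ensemble = [
        [True, True, True, False, False, True],
        [True, False, True, True, False, True],
        [False, True, True, True, True, False],
    ]
    print("mean disagreement:", mean_pairwise(disagreement, ensemble))
    print("mean Q statistic:", mean_pairwise(q_statistic, ensemble))
```

Under these assumptions, a higher mean disagreement or a lower mean Q indicates a more diverse ensemble. The same functions apply whether the members come from one learning algorithm (homogeneous) or from several (heterogeneous), since they only look at which samples each member classifies correctly.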
