IEEE International Conference on Information Reuse and Integration

Investigating the Variation of Ensemble Size on Bagging-Based Classifier Performance in Imbalanced Bioinformatics Datasets

Abstract

Practitioners in bioinformatics have used bagging ensemble techniques effectively to alleviate class imbalance and to improve the performance of classification models. However, many previous works have used bagging with only a single, arbitrarily chosen number of iterations. In this study, we ask how altering the number of iterations (the ensemble size) affects the classification performance of bagging classifiers. To answer this question, we conducted an empirical study using four choices for the number of iterations (10, 20, 50, and 100) within the bagging algorithm, across 15 imbalanced bioinformatics datasets. Our results indicate that 50 iterations performs slightly better than all the other choices, without exception, but the difference in performance is not statistically significant. Thus, we recommend bagging with 10 iterations because it achieves quality classification results, additional iterations do not significantly improve performance, and a smaller number of iterations is computationally less costly. The unique contribution of this work is to examine the effects of the number of iterations on the classification performance of bagging classifiers in the context of imbalanced datasets in the bioinformatics field.
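The kind of comparison the abstract describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' experimental setup: the abstract does not name the base learner, the performance metric, or the datasets, so the decision-stump learner, the AUC metric, the synthetic 9:1 imbalanced data, and every function name below are hypothetical choices made for the sketch. Only the four ensemble sizes (10, 20, 50, and 100) come from the abstract.

```python
import random

def make_imbalanced(n_major, n_minor, rng):
    """Synthetic two-feature data (an assumption): minority class 1 is shifted by +1.5."""
    data = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(n_major)]
    data += [([rng.gauss(1.5, 1), rng.gauss(1.5, 1)], 1) for _ in range(n_minor)]
    return data

def fit_stump(sample):
    """Pick the (feature, threshold) with the lowest training error.

    The stump predicts class 1 when x[feature] > threshold.
    """
    best = None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            t = x[f]
            err = sum((xi[f] > t) != (yi == 1) for xi, yi in sample)
            if best is None or err < best[0]:
                best = (err, f, t)
    return best[1], best[2]

def bagging_fit(train, n_iterations, rng):
    """Fit one stump per bootstrap sample (drawn with replacement, same size as train)."""
    return [fit_stump(rng.choices(train, k=len(train))) for _ in range(n_iterations)]

def bagging_score(stumps, x):
    """Fraction of ensemble members voting for the minority class."""
    return sum(x[f] > t for f, t in stumps) / len(stumps)

def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)                 # fixed seed so the run is repeatable
train = make_imbalanced(135, 15, rng)  # 9:1 class ratio, chosen arbitrarily
test = make_imbalanced(270, 30, rng)
labels = [y for _, y in test]

# Train the largest ensemble once; its first k members form a size-k bagging ensemble.
stumps = bagging_fit(train, 100, rng)

results = {}
for size in (10, 20, 50, 100):         # the four sizes named in the abstract
    scores = [bagging_score(stumps[:size], x) for x, _ in test]
    results[size] = auc(scores, labels)
    print(f"{size:>3} iterations: AUC = {results[size]:.3f}")
```

Reusing the first k members of one 100-member ensemble is a shortcut that works because each member is fit on an independent bootstrap sample. A study like the one described would more likely train each ensemble size separately, repeat the runs, and apply a statistical test to the differences.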