Conference: International Conference on Electrical Engineering and Informatics

Handling imbalanced dataset in multi-label text categorization using Bagging and Adaptive Boosting

Abstract

An imbalanced dataset arises from the uneven distribution of data available in the real world, such as the disposition of complaints to government offices in Bandung. Consequently, multi-label text categorization algorithms may not perform at their best, because classifiers tend to be dominated by the majority of the data and ignore the minority. In this paper, Bagging and Adaptive Boosting algorithms are employed to handle this issue and improve the performance of text categorization. The results are evaluated with four metrics: Hamming loss, subset accuracy, example-based accuracy, and micro-averaged F-measure. Bagging.ML-LP with an SMO weak classifier is the best performer in terms of subset accuracy and example-based accuracy. Bagging.ML-BR with an SMO weak classifier has the best micro-averaged F-measure of all the configurations. On the other hand, AdaBoost.MH with a J48 weak classifier has the lowest Hamming loss. Thus, both algorithms have high potential to improve the performance of text categorization, but only with certain weak classifiers. However, Bagging has more potential than Adaptive Boosting for increasing the accuracy of minority labels.
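The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the common textbook definitions (example-based accuracy taken as the per-example Jaccard score), which are assumed here and may differ in detail from the paper's implementation. The function names, the label names, and the toy predictions are all hypothetical and are not taken from the paper's data.

```python
from typing import List, Set

Labels = List[Set[str]]


def hamming_loss(y_true: Labels, y_pred: Labels, n_labels: int) -> float:
    """Fraction of (example, label) slots predicted incorrectly."""
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of examples whose predicted label set matches exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def example_based_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Mean per-example Jaccard score |T & P| / |T | P|."""
    scores = []
    for t, p in zip(y_true, y_pred):
        union = t | p
        # Assumption: two empty label sets count as a perfect match.
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def micro_f_measure(y_true: Labels, y_pred: Labels) -> float:
    """F-measure computed from counts pooled over all labels and examples."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: four complaints, three possible labels.
y_true = [{"roads"}, {"roads", "health"}, {"permits"}, {"health"}]
y_pred = [{"roads"}, {"roads"}, {"permits", "health"}, {"health"}]

print(hamming_loss(y_true, y_pred, n_labels=3))
print(subset_accuracy(y_true, y_pred))
print(example_based_accuracy(y_true, y_pred))
print(micro_f_measure(y_true, y_pred))
```

Lower is better for Hamming loss, while higher is better for the other three. Because the metrics weight partial matches differently, different configurations can rank first on different metrics, as the abstract reports.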
