Conference: Document Analysis Systems, DAS, 2008 Eighth IAPR Workshop on

New Oversampling Approaches Based on Polynomial Fitting for Imbalanced Data Sets

Abstract

In classification tasks, the class-modular strategy has been widely used, and it has outperformed the classical strategy for pattern classification in many applications. However, in some modular architectures, such as one-against-all support vector machine classifiers, one class in the training set can heavily outnumber the other. In this challenging situation, the trained classifier classifies the majority class accurately but marginalizes the minority class. As a result, the true negative rate (TNr) is very high while the true positive rate (TPr) is low. The main goal of this work is to improve TPr without much sacrifice in TNr. We propose oversampling the minority class using polynomial fitting functions. Four new approaches are proposed: star topology, bus topology, polynomial curve topology, and mesh topology. The star and mesh topology approaches led to the best performance.
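The two rates named in the abstract, and the general idea of oversampling a minority class, can be illustrated with a short sketch. The abstract does not define the four topologies, so `star_oversample` below is only an assumed reading of "star topology": each minority sample is joined to the class mean by a straight line (a degree-1 polynomial through the two points) and synthetic points are taken along that line. The function names, the `n_per_sample` parameter, and the choice of the mean as the star center are all hypothetical and are not taken from the paper.

```python
import numpy as np


def tpr_tnr(y_true, y_pred, positive=1):
    """Return (TPr, TNr) for binary labels; `positive` marks the minority class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    neg = ~pos
    tpr = np.mean(y_pred[pos] == positive)  # TP / (TP + FN)
    tnr = np.mean(y_pred[neg] != positive)  # TN / (TN + FP)
    return float(tpr), float(tnr)


def star_oversample(X_min, n_per_sample=2):
    """Assumed 'star' scheme (not confirmed by the abstract): add points on the
    line from each minority sample to the minority-class mean."""
    X_min = np.asarray(X_min, dtype=float)
    center = X_min.mean(axis=0)
    # Interpolation factors strictly between 0 (the sample) and 1 (the center).
    ts = np.linspace(0.0, 1.0, n_per_sample + 2)[1:-1]
    synth = [x + t * (center - x) for x in X_min for t in ts]
    # Original samples first, synthetic samples appended after them.
    return np.vstack([X_min, np.array(synth)])
```

A classifier retrained on the enlarged minority set could then be compared with the baseline by calling `tpr_tnr` on both sets of predictions, checking that TPr rises while TNr stays close to its original value.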