Expert Systems with Applications

A priori synthetic over-sampling methods for increasing classification sensitivity in imbalanced data sets

Abstract

Building accurate classifiers for predicting group membership is difficult when the data are skewed or imbalanced, as is typical of real-world data sets. As a result, the classifier tends to be biased towards the over-represented, or majority, group. Re-sampling techniques offer simple approaches that can be used to minimize this effect. Over-sampling methods aim to combat class imbalance by increasing the number of minority group samples, also referred to as members of the minority group. Over the last decade, SMOTE-based methods have been used and extended to overcome this problem. There has been little emphasis, however, on improving this approach by considering intrinsic properties of the data beyond class imbalance alone. In this paper we introduce modifications to the a priori based methods Safe Level OUPS and OUPS that improve sensitivity measures over competing SMOTE-based approaches such as the Local neighborhood extension to SMOTE (LN-SMOTE), Borderline-SMOTE and Safe-Level-SMOTE. (C) 2016 Elsevier Ltd. All rights reserved.
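To make the over-sampling idea in the abstract concrete, here is a minimal sketch of plain SMOTE-style interpolation, together with the sensitivity measure the abstract refers to. It is not the OUPS or Safe Level OUPS method, whose details the abstract does not give. The function names `smote_sketch` and `sensitivity`, the parameter `k`, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper: basic SMOTE-style interpolation, not OUPS."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall on the positive class): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if tp + fn else 0.0


# Toy minority class: the four corners of the unit square (assumed data)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=10, k=2)

# One of two positives is detected, so sensitivity is 0.5
print(sensitivity([1, 1, 0, 0], [1, 0, 0, 0]))
```

Each synthetic point lies on the line segment between two existing minority samples, so it stays inside the region those samples already occupy. Sensitivity is the natural measure here because it counts only how many true minority members the classifier recovers.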