MWMOTE--Majority Weighted Minority Oversampling Technique for Imbalanced Data Set Learning

Barua Sukarna; Islam Md.Monirul; Yao Xin; Murase Kazuyuki

Abstract

Imbalanced learning problems contain an unequal distribution of data samples among different classes and pose a challenge to any classifier, because the minority class samples become hard to learn. Synthetic oversampling methods address this problem by generating synthetic minority class samples to balance the distribution between the majority and minority classes. This paper identifies that most existing oversampling methods may generate wrong synthetic minority samples in some scenarios and so make learning tasks harder. To this end, a new method, called Majority Weighted Minority Oversampling TEchnique (MWMOTE), is presented for efficiently handling imbalanced learning problems. MWMOTE first identifies the hard-to-learn informative minority class samples and assigns them weights according to their Euclidean distance from the nearest majority class samples. It then generates the synthetic samples from the weighted informative minority class samples using a clustering approach. This is done in such a way that all the generated samples lie inside some minority class cluster. MWMOTE has been evaluated extensively on four artificial and 20 real-world data sets. The simulation results show that the method is better than or comparable with some other existing methods in terms of various assessment metrics, such as geometric mean (G-mean) and area under the receiver operating characteristic (ROC) curve, usually known as area under curve (AUC).
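The three steps named in the abstract (select informative minority samples, weight them by distance to the majority class, generate samples inside a minority cluster) can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the selection rule, the weighting formula, or the clustering method, so everything below is an assumption made for illustration. In particular, the function names, the `n_informative` and `threshold` parameters, the inverse-distance weighting, and the threshold-based single-linkage clustering are all hypothetical choices.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_by_threshold(points, threshold):
    """Single-linkage grouping: points within `threshold` share a cluster id.

    Assumption: stands in for the unspecified "clustering approach".
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if euclidean(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def oversample(minority, majority, n_new, n_informative=3, threshold=1.5, seed=0):
    """Return n_new synthetic minority points (illustrative sketch only)."""
    rng = random.Random(seed)
    # Step 1: distance from each minority sample to its nearest majority sample.
    nearest = [min(euclidean(m, q) for q in majority) for m in minority]
    # Assumption: the n_informative samples closest to the majority class
    # are treated as the hard-to-learn "informative" ones.
    order = sorted(range(len(minority)), key=lambda i: nearest[i])
    informative = order[:n_informative]
    # Step 2 (assumption): weight is inversely proportional to that distance.
    weights = [1.0 / (nearest[i] + 1e-12) for i in informative]
    # Step 3: cluster the whole minority class, then interpolate only between
    # a weighted seed and a member of the seed's own cluster.
    cluster = cluster_by_threshold(minority, threshold)
    synthetic = []
    for _ in range(n_new):
        i = rng.choices(informative, weights=weights, k=1)[0]
        mates = [j for j in range(len(minority)) if cluster[j] == cluster[i]]
        j = rng.choice(mates)
        alpha = rng.random()  # position on the segment from seed to mate
        x, y = minority[i], minority[j]
        synthetic.append(tuple(a + alpha * (b - a) for a, b in zip(x, y)))
    return synthetic


# Toy data: two well-separated minority groups and a majority group between them.
minority = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10)]
majority = [(3, 3), (4, 4), (5, 5), (3, 4)]
print(oversample(minority, majority, n_new=5))
```

Because each synthetic point is an interpolation between two members of the same cluster, none can land in the gap between the two minority groups, which is the property the abstract attributes to the cluster-based generation step.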


Bibliographic details

Source: IEEE Transactions on Knowledge and Data Engineering, 2014, no. 2, pp. 405-425 (21 pages)
Authors: Barua Sukarna; Islam Md. Monirul; Yao Xin; Murase Kazuyuki
Author affiliation: BUET, Dhaka
Original format: PDF
Language: English
Keywords: Imbalanced learning; clustering; oversampling; synthetic sample generation; undersampling

Similar literature

1. Fuzzy–synthetic minority oversampling technique: Oversampling based on fuzzy set theory for Android malware detection in imbalanced datasets [J]. Yanping Xu, Chunhua Wu, Kangfeng Zheng. International Journal of Distributed Sensor Networks, 2017, no. 4.
2. An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem [J]. Chao-Ran Wang, Xin-Hui Shao. Quality Control, Transactions, 2021, no. 1.
3. NI-MWMOTE: An improving noise-immunity majority weighted minority oversampling technique for imbalanced classification problems [J]. Wei Jianan, Huang Haisong, Yao Liguo. Expert Systems with Applications, 2020, issue Nova.
4. A Novel Synthetic Minority Oversampling Technique for Imbalanced Data Set Learning [C]. Sukarna Barua, Md. Monirul Islam, Kazuyuki Murase. International Conference on Neural Information Processing (ICONIP 2011), 2011.
5. Proxy Relearning for Feature-Driven Pattern Recognition in High-Dimensional Imbalanced Time Series Data Sets [D]. Cho, Wilfred Yau-Chuen. 2017.
6. Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification [O]. Jinyan Li, Simon Fong, Yunsick Sung. 2016.
7. Penanganan imbalance class data laboratorium kesehatan dengan Majority Weighted Minority Oversampling Technique (Handling imbalanced class data from a health laboratory with the Majority Weighted Minority Oversampling Technique) [O]. Meida Cahyo Untoro, Joko Lianto Buliali. 2018.
