Quartiles based UnderSampling(QUS): A Simple and Novel Method to increase the Classification rate of positives in Imbalanced Datasets

Abstract

The main challenge in learning from imbalanced datasets is that a large set of training examples is available for the negatives (majority class instances) but very few for the positives (minority class instances). As a result, a classifier can show good overall performance even though its classification rate on positives is greatly reduced. The Quartiles based UnderSampling (QUS) method proposed in this paper addresses this problem in a simple way: it balances the dataset by selecting negatives according to their similarity to five quartile points, namely the minimum, first quartile (Q1), median, third quartile (Q3), and maximum. The intention is to reduce the influence of the excess negatives, which would otherwise bias the classifier towards classifying negatives well. An advantage of this undersampling method is that it is parameter independent, and it gives better results than the state-of-the-art methods. The proposed method is tested with a kNN (k Nearest Neighbour) classifier, and the empirical results show a higher classification rate on positives than is obtained with the original unprocessed imbalanced dataset.
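
The abstract does not say how similarity to the five quartile points is measured, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the five points are taken per feature over the negatives, that similarity is Euclidean distance to the resulting five reference vectors, and that negatives are picked round-robin until they match the number of positives. The function names `quartile_undersample` and `balance_with_quartiles` are hypothetical.

```python
import numpy as np


def quartile_undersample(X_neg, n_keep):
    """Return indices of n_keep negatives nearest to five summary vectors.

    ASSUMPTION: the reference vectors are the per-feature minimum, Q1,
    median, Q3 and maximum of the negatives; the abstract does not
    specify this construction.
    """
    X_neg = np.asarray(X_neg, dtype=float)
    if n_keep <= 0:
        return np.array([], dtype=int)
    if n_keep >= len(X_neg):
        return np.arange(len(X_neg))

    # Five reference vectors, shape (5, n_features).
    refs = np.percentile(X_neg, [0, 25, 50, 75, 100], axis=0)

    # Euclidean distance from every negative to every reference, shape (n, 5).
    dists = np.linalg.norm(X_neg[:, None, :] - refs[None, :, :], axis=2)

    # For each reference, negatives sorted from nearest to farthest.
    order = np.argsort(dists, axis=0, kind="stable")

    chosen, seen = [], set()
    # Round-robin: take the nearest unused negative for each reference in turn.
    for rank in range(len(X_neg)):
        for ref in range(refs.shape[0]):
            idx = int(order[rank, ref])
            if idx not in seen:
                seen.add(idx)
                chosen.append(idx)
                if len(chosen) == n_keep:
                    return np.array(chosen)
    return np.array(chosen)


def balance_with_quartiles(X, y, positive_label=1):
    """Keep all positives and an equal number of selected negatives."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos = np.flatnonzero(y == positive_label)
    neg = np.flatnonzero(y != positive_label)
    keep = quartile_undersample(X[neg], len(pos))
    idx = np.concatenate([pos, neg[keep]])
    return X[idx], y[idx]
```

The balanced arrays returned by `balance_with_quartiles` could then be used to fit a kNN classifier, and the classification rate on positives compared against a model trained on the unprocessed data. The sketch has no tunable parameters beyond the target count, which is in the spirit of the parameter independence claimed in the abstract.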

Bibliographic details

Source
International Conference on Advances in Pattern Recognition | 2017 | 439p | 6 pages in total
Authors
C.V. Krishna Veni; T. Sobha Rani
Original format: PDF
Chinese Library Classification: TP391.41-53
Keywords
Training; Sensitivity; Complexity theory; Indexes; Medical diagnosis; Credit cards; Linear matrix inequalities
