IEEE/ACIS International Conference on Software Engineering Research, Management and Applications

Analysis of Focussed Under-Sampling Techniques with Machine Learning Classifiers



Abstract

The class imbalance problem is a major issue in machine learning: it produces biased classifiers that work well for the majority class but perform relatively poorly on the minority class. To build accurate prediction models, the class imbalance problem must be addressed. In this paper, it is handled with focused under-sampling techniques, namely Cluster Based, Tomek Link and Condensed Nearest Neighbours, which equalize the number of instances of the two classes by under-sampling the majority class according to specific criteria. This is in contrast to random under-sampling, where data samples are selected randomly from the majority class, leading to underfitting and the loss of important data points. To fairly compare and evaluate the performance of the focused under-sampling approaches, prediction models are constructed using popular machine learning classifiers: K-Nearest Neighbor, Decision Tree and Naive Bayes. The results show that Decision Tree outperformed the other machine learning techniques, and a comparison of the under-sampling approaches for Decision Tree found Condensed Nearest Neighbours to be the best among them.
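The paper's own code is not reproduced here; the following is a minimal sketch of the kind of pipeline the abstract describes, assuming the scikit-learn and imbalanced-learn libraries and a synthetic imbalanced dataset. ClusterCentroids stands in for the cluster-based under-sampler, and the dataset, parameters and evaluation metric are illustrative choices, not the paper's.

```python
# Sketch: focused under-sampling of the majority class, then training the
# three classifiers named in the abstract on the resampled data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from imblearn.under_sampling import ClusterCentroids, TomekLinks, CondensedNearestNeighbour

# Synthetic imbalanced data: roughly 90% majority class, 10% minority class.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=42)

samplers = {
    "Cluster Based": ClusterCentroids(random_state=42),
    "Tomek Link": TomekLinks(),
    "Condensed Nearest Neighbours": CondensedNearestNeighbour(random_state=42),
}
classifiers = {
    "K-Nearest Neighbor": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}

for s_name, sampler in samplers.items():
    # Under-sample the majority class of the training split only;
    # the test split keeps its original class distribution.
    X_res, y_res = sampler.fit_resample(X_train, y_train)
    for c_name, clf in classifiers.items():
        clf.fit(X_res, y_res)
        score = f1_score(y_test, clf.predict(X_test))  # F1 on the minority class
        print(f"{s_name} + {c_name}: minority-class F1 = {score:.3f}")
```

Resampling only the training split keeps the evaluation honest: the classifiers are compared on test data whose imbalance matches the original problem.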
