Rough Sets in Imbalanced Data Problem: Improving Re-sampling Process

Abstract

The imbalanced data problem remains one of the most interesting and important research subjects. Recent experiments and detailed analysis have revealed that performance loss in machine learning is caused not only by underrepresented classes but also by the inherently complex characteristics of the data. The significant difficulty factors identified so far include class overlapping, decomposition of the minority class, and the presence of noise and outliers. Although numerous solutions have been proposed, it is still unclear how to deal with all of these issues together and how to correctly evaluate the class distribution in order to select a proper treatment (especially in real-world applications, where levels of uncertainty are eminently high). Since applying rough set theory to the imbalanced data learning problem could be a promising research direction, this paper introduces an improved re-sampling approach that combines selective preprocessing and editing techniques. The novel technique handles both qualitative and quantitative data.

Bibliographic record

Source
IFIP TC8 International Conference on Computer Information Systems and Industrial Management | 2017 | 710p | 11 pages
Authors
Katarzyna Borowska; Jaroslaw Stepaniuk;
Original format: PDF
Chinese Library Classification: TP3-53
Keywords
Data preprocessing; Class imbalance; Rough sets; SMOTE; Oversampling; Undersampling

Similar literature

1. Handling imbalanced data sets with synthetic boundary data generation using bootstrap re-sampling and AdaBoost techniques [J]. Putthiporn Thanathamathee, Chidchanok Lursinsap. Pattern Recognition Letters. 2013, No. 12
2. SMOTE-RSB_*: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory [J]. Enislay Ramentol, Yaile Caballero, Rafael Bello, Knowledge and Information Systems. 2012, No. 2
3. SMOTE-RSB *: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory [J]. Enislay Ramentol, Yailé Caballero, Rafael Bello, Knowledge and Information Systems. 2012, No. 2
4. Rough Sets in Imbalanced Data Problem: Improving Re-sampling Process [C]. Katarzyna Borowska, Jaroslaw Stepaniuk. Computer Information Systems and Industrial Management. 2017
5. Uncertainty processing in a relational database model via a rough set representation [D]. Beaubouef, Theresa Ann. 1994
6. Risk factor analysis of device-related infections: value of re-sampling method on the real-world imbalanced dataset [O]. Xiang-Fei Feng, Ling-Chao Yang, Li-Zhuang Tan, 2019
7. Imbalanced Data Classification: A Novel Re-sampling Approach Combining Versatile Improved SMOTE and Rough Sets [O]. Borowska, Katarzyna; Stepaniuk, Jarosław. 2016
8. Seventh International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing (RSFDGrC'99) [R]. Zhong, N., Skowron, A., Ohsuga, S. 1999
