Rough Sets in Imbalanced Data Problem: Improving Re-sampling Process

Abstract

The imbalanced data problem remains one of the most interesting and important research subjects. Recent experiments and detailed analysis have revealed that performance loss in machine learning is caused not only by underrepresented classes but also by the inherently complex characteristics of the data. The significant difficulty factors identified so far include class overlapping, decomposition of the minority class, and the presence of noise and outliers. Although numerous solutions have been proposed, it is still unclear how to deal with all of these issues together and how to correctly evaluate the class distribution in order to select a proper treatment (especially in real-world applications, where levels of uncertainty are eminently high). Since applying rough set theory to the imbalanced data learning problem could be a promising research direction, this paper introduces an improved re-sampling approach that combines selective preprocessing and editing techniques. The novel technique handles both qualitative and quantitative data.
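This record does not describe the paper's algorithm, so the following is only a minimal sketch of the standard rough-set notions the abstract relies on, not the authors' method. It assumes purely qualitative attributes and treats two examples as indiscernible when all their attribute values are equal. The function name `approximations` and the toy data are hypothetical. Minority examples in the lower approximation are certain members of the class, while those in the boundary region overlap with the majority class and are natural candidates for special treatment during re-sampling.

```python
from collections import defaultdict

def approximations(rows, labels, target):
    """Return (lower, upper) index sets for class `target`.

    Assumption: indiscernibility means exact equality on all attributes.
    """
    # Group example indices into indiscernibility classes.
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row)].append(i)

    lower, upper = set(), set()
    for members in blocks.values():
        hits = [i for i in members if labels[i] == target]
        if hits:
            # The block intersects the target class: part of the upper approximation.
            upper.update(members)
            if len(hits) == len(members):
                # The block lies entirely inside the target class: lower approximation.
                lower.update(members)
    return lower, upper

# Toy data (hypothetical): two qualitative attributes, minority class "min".
rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "x"), ("b", "x"), ("c", "y")]
labels = ["min", "maj", "min", "maj", "maj", "maj"]

lower, upper = approximations(rows, labels, "min")
boundary = upper - lower  # examples that cannot be classified with certainty
print(sorted(lower), sorted(upper), sorted(boundary))
```

Here examples 0 and 1 share the same attribute values but carry different labels, so they fall into the boundary region, whereas example 2 is the only certain minority example.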

Bibliographic details

Source
Computer Information Systems and Industrial Management, 2017, pp. 459-469 (11 pages)
Conference location: Bialystok (PL)
Authors
Katarzyna Borowska; Jaroslaw Stepaniuk
Affiliations

Faculty of Computer Science, Bialystok University of Technology, Wiejska 45A, 15-351 Bialystok, Poland;

Faculty of Computer Science, Bialystok University of Technology, Wiejska 45A, 15-351 Bialystok, Poland;

Original format: PDF
Language: English
Keywords
Data preprocessing; Class imbalance; Rough sets; SMOTE; Oversampling; Undersampling;

Similar literature

1. Handling imbalanced data sets with synthetic boundary data generation using bootstrap re-sampling and AdaBoost techniques [J]. Putthiporn Thanathamathee, Chidchanok Lursinsap. Pattern Recognition Letters, 2013, No. 12

2. SMOTE-RSB_*: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory [J]. Enislay Ramentol, Yaile Caballero, Rafael Bello. Knowledge and Information Systems, 2012, No. 2

3. SMOTE-RSB *: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory [J]. Enislay Ramentol, Yailé Caballero, Rafael Bello. Knowledge and Information Systems, 2012, No. 2

4. Rough Sets in Imbalanced Data Problem: Improving Re-sampling Process [C]. Katarzyna Borowska, Jaroslaw Stepaniuk. IFIP TC8 International Conference on Computer Information Systems and Industrial Management, 2017

5. Uncertainty processing in a relational database model via a rough set representation [D]. Beaubouef, Theresa Ann. 1994

6. Risk factor analysis of device-related infections: value of re-sampling method on the real-world imbalanced dataset [O]. Xiang-Fei Feng, Ling-Chao Yang, Li-Zhuang Tan. 2019

7. Imbalanced Data Classification: A Novel Re-sampling Approach Combining Versatile Improved SMOTE and Rough Sets [O]. Borowska, Katarzyna; Stepaniuk, Jarosław. 2016

8. Seventh International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing (RSFDGrC'99) [R]. Zhong, N.; Skowron, A.; Ohsuga, S. 1999
