HIBoost: A hubness-aware ensemble learning algorithm for high-dimensional imbalanced data classification

Wu Qin; Lin Yaping; Zhu Tuanfei; Zhang Yue


Abstract

Learning from high-dimensional imbalanced data is common in many important real-world applications and poses a severe challenge to traditional data mining and machine learning algorithms. Existing work generally applies dimensionality reduction to deal with the curse of dimensionality and then applies traditional imbalance learning techniques to combat class imbalance. However, dimensionality reduction may discard useful information, especially for the minority classes. This paper introduces an ensemble-based method, HIBoost, that handles the imbalanced learning problem directly in the high-dimensional space. HIBoost takes into account the hubness phenomenon inherent in high-dimensional data: such data tends to contain singular points, hubs and anti-hubs, which occur unusually frequently or unusually rarely among the k-nearest neighbors of other points. For these hubs and anti-hubs, HIBoost introduces a discount factor that restricts their weight growth during the weight update, reducing the risk of overfitting when training the component classifiers. To address class imbalance, HIBoost uses SMOTE to balance the training data in each iteration, alleviating the prediction bias of the component classifiers. Experimental results on sixteen high-dimensional imbalanced data sets demonstrate the effectiveness of HIBoost.
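The abstract does not give the formula for the discount factor, so the following is only a rough sketch of the general idea, not the authors' method. It counts how often each point appears among the k-nearest neighbors of the other points (the k-occurrence, written N_k here), derives a discount from how far N_k lies from its mean, and uses that discount to damp an AdaBoost-style weight increase. The function names, the exponential form of the discount, and the `strength` parameter are all assumptions made for illustration. The per-iteration SMOTE balancing step is omitted.

```python
import math


def k_occurrence(points, k):
    """Return N_k for every point: how many other points list it among
    their k nearest neighbors (Euclidean distance, brute force)."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Sort all other points by their distance to point i.
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in neighbors[:k]:
            counts[j] += 1
    return counts


def hubness_discount(counts, k, strength=1.0):
    """Hypothetical discount in (0, 1]. The mean of N_k over a data set
    is k, so |N_k - k| / k measures how singular a point is. Hubs
    (N_k >> k) and anti-hubs (N_k near 0) get a smaller discount."""
    return [math.exp(-strength * abs(c - k) / k) for c in counts]


def discounted_weight_update(weights, misclassified, alpha, discounts):
    """AdaBoost-style update: a misclassified point's weight grows by
    exp(alpha * discount) instead of exp(alpha). This is an assumed
    form, chosen only to show how a discount can slow weight growth."""
    updated = [
        w * math.exp(alpha * d) if miss else w
        for w, miss, d in zip(weights, misclassified, discounts)
    ]
    total = sum(updated)
    return [w / total for w in updated]  # renormalize to sum to 1


if __name__ == "__main__":
    # Three clustered points and one isolated point (an anti-hub for k=2).
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.2), (5.0, 5.0)]
    k = 2
    counts = k_occurrence(points, k)
    discounts = hubness_discount(counts, k)
    weights = [1.0 / len(points)] * len(points)
    # Pretend the component classifier misclassified points 0 and 3.
    misclassified = [True, False, False, True]
    new_weights = discounted_weight_update(weights, misclassified, 1.0, discounts)
    print("N_k:", counts)
    print("discounts:", discounts)
    print("new weights:", new_weights)
```

In this toy run both point 0 and the isolated point 3 are misclassified, but the isolated point never appears in anyone's neighbor list, so its discount is smaller and its weight grows less. That is the effect the abstract attributes to the discount factor: singular points cannot dominate later boosting rounds.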


Bibliographic details

Source
Journal of intelligent & fuzzy systems: Applications in Engineering and Technology | 2020, Issue 1 | 12 pages
Authors
Wu Qin; Lin Yaping; Zhu Tuanfei; Zhang Yue
Author affiliations

Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China;
Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China;
Changsha Univ, Coll Comp Engn & Appl Math, Changsha, Peoples R China;
Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
Original format: PDF
Language: English
Chinese Library Classification: Automation systems
Keywords
Hubness; class imbalance; high dimension; SMOTE; AdaBoost

