Fuzzy integral-based ELM ensemble for imbalanced big data classification

Junhai Zhai; Sufang Zhang; Mingyang Zhang; Xiaomeng Liu

Abstract

Big data are data too large to be handled and analyzed by traditional software tools, and they can be characterized by five V's: volume, velocity, variety, value and veracity. However, in the real world, some big data have another feature, namely class imbalance; e-health big data, credit card fraud detection big data and extreme weather forecast big data are all class imbalanced. To deal with the problem of classifying binary imbalanced big data, this paper proposes a promising algorithm based on MapReduce, non-iterative learning, ensemble learning and oversampling, which includes three stages. Firstly, for each positive instance, its enemy nearest neighbor is found with MapReduce, and p positive instances are randomly generated with uniform distribution in its enemy nearest neighbor hypersphere, i.e., p positive instances are oversampled within the hypersphere. Secondly, l balanced data subsets are constructed and l classifiers are trained on the constructed data subsets with a non-iterative learning approach. Finally, the trained classifiers are integrated by fuzzy integral to classify unseen instances. We experimentally compared the proposed algorithm with three related algorithms: SMOTE, SMOTE+RF-BigData and MR-V-ELM, and conducted a statistical analysis on the experimental results. The experimental results and the statistical analysis demonstrate that the proposed algorithm outperforms the other three methods.
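The first stage (oversampling inside the enemy nearest neighbor hypersphere) can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the abstract does not give the exact construction, so the code assumes that the hypersphere is centered on the positive instance, that its radius is the Euclidean distance to the nearest negative instance, and that the search is a brute-force scan instead of a MapReduce job. The function names `nearest_enemy_distance`, `sample_in_ball` and `oversample` are hypothetical.

```python
import math
import random


def nearest_enemy_distance(x, negatives):
    """Euclidean distance from x to its closest negative (enemy) instance."""
    return min(math.dist(x, n) for n in negatives)


def sample_in_ball(center, radius, rng):
    """Draw one point uniformly from the ball of the given radius around center."""
    d = len(center)
    # A normalized Gaussian vector has a uniformly distributed direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction))
    # Scaling by u**(1/d) makes the samples uniform in volume, not in radius.
    r = radius * rng.random() ** (1.0 / d)
    return [c + r * v / norm for c, v in zip(center, direction)]


def oversample(positives, negatives, p, seed=0):
    """Generate p synthetic positives around every original positive instance.

    Assumption: the hypersphere is centered on the positive instance and
    reaches exactly to its nearest negative neighbor.
    """
    rng = random.Random(seed)
    synthetic = []
    for x in positives:
        radius = nearest_enemy_distance(x, negatives)
        synthetic.extend(sample_in_ball(x, radius, rng) for _ in range(p))
    return synthetic


if __name__ == "__main__":
    positives = [[0.0, 0.0], [5.0, 5.0]]
    negatives = [[1.0, 0.0], [5.0, 8.0]]
    for point in oversample(positives, negatives, p=3):
        print(point)
```

Under these assumptions every synthetic point lies no farther from its source instance than that instance's nearest negative neighbor. The later stages (building the l balanced subsets, training the classifiers and fusing them by fuzzy integral) are not sketched here because the abstract does not specify them in enough detail.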

Bibliographic details

Source
Soft computing: A fusion of foundations, methodologies and applications | 2018, Issue 11 | 13 pages
Authors
Junhai Zhai; Sufang Zhang; Mingyang Zhang; Xiaomeng Liu
Author affiliations

Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University;

Hebei Branch of China Meteorological Administration Training Centre, China Meteorological Administration;

Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University;

Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computer software
Keywords
Imbalanced big data; MapReduce; Non-iterative learning; Oversampling; Fuzzy integral

Similar documents

1. Fuzzy integral-based ELM ensemble for imbalanced big data classification [J]. Junhai Zhai, Sufang Zhang, Mingyang Zhang. Soft computing: A fusion of foundations, methodologies and applications. 2018, Issue 11
2. Meta-learning for imbalanced data and classification ensemble in binary classification [J]. Sung-Chiang Lin, Yuan-chin I. Chang, Wei-Ning Yang. Neurocomputing. 2009, Issue 1a3
3. A hybrid method based on ensemble WELM for handling multi class imbalance in cancer microarray data [J]. Liu Zhen, Tang Deyu, Cai Yongming. Neurocomputing. 2017, Issue nova29
4. Constructing Support Vector Machines Ensemble Classification Method for Imbalanced Datasets Based on Fuzzy Integral [C]. Pu Chen, Dayong Zhang. International conference on industrial engineering and other applications of applied intelligence systems. 2014
5. Fractional Random Weighted Bootstrapping for Classification on Imbalanced Data with Ensemble Decision Tree Methods [D]. Carter, Sean Charles. 2019
6. New Fuzzy Support Vector Machine for the Class Imbalance Problem in Medical Datasets Classification [O]. Xiaoqing Gu, Tongguang Ni, Hongyuan Wang
7. Fuzzy Integral-based Neural Network Ensemble for Facial Expression Recognition [O]. Z.Y Wang, N.F Xiao. 2015
