Journal of Grid Computing

A Dynamic Spark-based Classification Framework for Imbalanced Big Data

Abstract

Classification of imbalanced big data has received extensive attention from researchers over the last decade. Standard classification methods classify minority class samples poorly. Several approaches have been introduced to solve the class imbalance problem in big data and improve the generalization of classifiers. However, most of these approaches neglect the effect of border samples on classification performance; high-impact border samples may be exposed to misclassification. In this paper, a Spark Based Mining Framework (SBMF) is proposed to address the imbalanced data problem. Two main modules are designed for this purpose. The first is the Border Handling Module (BHM), which undersamples the low-impact majority border instances and oversamples the minority class instances. The second is the Selective Border Instances (SBI) sampling module, which enhances the output of the BHM. The performance of the SBMF framework is evaluated and compared with other recent systems. A number of experiments were performed using moderate and big datasets with different imbalance ratios. Compared with recent work, the results obtained from the SBMF framework show better performance across the different datasets and classifiers.
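The abstract does not specify how border instances are detected or how their impact is measured, so the following is only a minimal single-machine sketch of the general idea of border-aware resampling, not the SBMF or BHM algorithm. Everything in it is an assumption made for illustration: the function names `knn_indices` and `border_aware_resample`, the k-nearest-neighbour definition of a border instance, the `drop_threshold` rule used as a stand-in for "low impact", and the use of random duplication for oversampling. It uses NumPy rather than Spark to stay short.

```python
import numpy as np


def knn_indices(X, k):
    """Return the indices of the k nearest neighbours of each row (brute force, self excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def border_aware_resample(X, y, minority=1, k=5, drop_threshold=0.5, seed=0):
    """Hypothetical border-aware resampling; NOT the method from the paper.

    Assumptions made for this sketch:
      * a majority instance is a "border" instance if at least one of its
        k nearest neighbours belongs to the minority class;
      * a border majority instance is removed when the fraction of minority
        neighbours is >= drop_threshold (an assumed proxy for "low impact");
      * the minority class is oversampled by random duplication until it
        matches the number of remaining majority instances.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    is_min = y == minority
    neighbours = knn_indices(X, k)
    min_frac = is_min[neighbours].mean(axis=1)  # minority share among neighbours

    border_majority = ~is_min & (min_frac > 0)
    drop = border_majority & (min_frac >= drop_threshold)

    # Undersampling step: remove the selected border majority instances.
    X_kept, y_kept = X[~drop], y[~drop]

    # Oversampling step: duplicate random minority instances up to parity.
    min_idx = np.flatnonzero(y_kept == minority)
    n_major = int((y_kept != minority).sum())
    if min_idx.size == 0 or n_major <= min_idx.size:
        return X_kept, y_kept
    extra = rng.choice(min_idx, size=n_major - min_idx.size, replace=True)

    X_out = np.vstack([X_kept, X_kept[extra]])
    y_out = np.concatenate([y_kept, y_kept[extra]])
    return X_out, y_out
```

The brute-force distance matrix is quadratic in the number of rows, so this form only suits small samples; a distributed version would need partitioned or approximate neighbour search, which the abstract does not describe.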
