
Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data

Abstract

Classification with big data has become one of the latest trends in learning from available information. Data growth in recent years has sharply increased interest in effectively acquiring knowledge to analyze and predict trends. The variety and veracity associated with big data introduce a degree of uncertainty that has to be handled in addition to the volume and velocity requirements. Such data also frequently exhibits the problem of classification with imbalanced datasets: a class distribution in which the most important concepts to be learned are represented by a negligible number of examples relative to the number of examples from the other classes. To deal adequately with imbalanced big data, we propose the Chi-FRBCS-BigDataCS algorithm, a fuzzy rule based classification system that can handle the uncertainty introduced in large volumes of data without disregarding learning in the underrepresented class. The method uses the MapReduce framework to distribute the computational operations of the fuzzy model, and it includes cost-sensitive learning techniques in its design to address the imbalance present in the data. The good performance of this approach is supported by an experimental analysis carried out over twenty-four imbalanced big data case studies. The results show that the proposal is able to handle these problems, obtaining competitive results in both the classification performance of the model and the time needed for the computation.
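
The abstract does not give the algorithm's details, so the following is only a minimal, single-process sketch of the general idea it describes: Chi-style fuzzy rule generation run independently on each data block (the map step), with the resulting rule bases merged afterwards (the reduce step), and per-class misclassification costs folded into the rule weights. The triangular membership functions, the penalized cost-weighted certainty factor, the "keep the heaviest rule on conflict" merge policy, and all function names are assumptions made for illustration, not the authors' published implementation.

```python
from functools import reduce


def memberships(x, n_labels=3):
    """Triangular membership degrees of x in [0, 1] for evenly spaced labels."""
    step = 1.0 / (n_labels - 1)
    return [max(0.0, 1.0 - abs(x - i * step) / step) for i in range(n_labels)]


def antecedent(example, n_labels=3):
    """Chi-style antecedent: per attribute, the label with the highest membership."""
    labels = []
    for x in example:
        mu = memberships(x, n_labels)
        labels.append(mu.index(max(mu)))
    return tuple(labels)


def matching(example, labels, n_labels=3):
    """Matching degree of an example with a rule antecedent (product t-norm)."""
    degree = 1.0
    for x, label in zip(example, labels):
        degree *= memberships(x, n_labels)[label]
    return degree


def map_partition(partition, costs, n_labels=3):
    """Map step (assumed): build {antecedent: (class, weight)} from one data block."""
    candidates = {(antecedent(x, n_labels), y) for x, y in partition}
    rules = {}
    for labels, cls in candidates:
        same = other = 0.0
        for x, y in partition:
            weighted = matching(x, labels, n_labels) * costs[y]
            if y == cls:
                same += weighted
            else:
                other += weighted
        # Assumed rule weight: penalized, cost-weighted certainty factor.
        weight = (same - other) / (same + other)
        if weight > 0 and weight > rules.get(labels, (None, 0.0))[1]:
            rules[labels] = (cls, weight)
    return rules


def reduce_rule_bases(rb_a, rb_b):
    """Reduce step (assumed): merge rule bases, keeping the heaviest rule on conflict."""
    merged = dict(rb_a)
    for labels, (cls, weight) in rb_b.items():
        if labels not in merged or weight > merged[labels][1]:
            merged[labels] = (cls, weight)
    return merged


def classify(example, rules, n_labels=3):
    """Winning-rule inference: class of the rule with the highest matching * weight."""
    best_cls, best_score = None, 0.0
    for labels, (cls, weight) in rules.items():
        score = matching(example, labels, n_labels) * weight
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls


# Toy data with attributes already scaled to [0, 1]; "pos" is the minority class.
data = [((0.1, 0.2), "neg"), ((0.2, 0.1), "neg"), ((0.15, 0.3), "neg"),
        ((0.9, 0.8), "pos"), ((0.3, 0.2), "neg"), ((0.85, 0.9), "pos")]
# Assumed cost scheme: the minority class costs the imbalance ratio (4 / 2).
costs = {"neg": 1.0, "pos": 2.0}
partitions = [data[:3], data[3:]]
rule_base = reduce(reduce_rule_bases, (map_partition(p, costs) for p in partitions))
print(classify((0.95, 0.9), rule_base), classify((0.05, 0.1), rule_base))
```

In a real MapReduce deployment each call to `map_partition` would run on a separate node over its own block of the dataset, and only the compact rule bases would be shuffled to the reducer; here `functools.reduce` stands in for that step.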

Bibliographic details

  • Source
    Fuzzy Sets and Systems | 2015, Issue 1 | pp. 5-38 | 34 pages
  • Author affiliations

    Dept. of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), University of Granada, Granada, Spain;

    Dept. of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), University of Granada, Granada, Spain;

    Dept. of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), University of Granada, Granada, Spain;

    Dept. of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), University of Granada, Granada, Spain;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Fuzzy rule based classification systems; Big data; MapReduce; Hadoop; Imbalanced datasets; Cost-sensitive learning;
