
A hybrid method based on ensemble WELM for handling multi class imbalance in cancer microarray data


Abstract

DNA microarray technology provides an efficient way to diagnose cancer. However, microarray gene expression data pose two challenges: class imbalance and high dimensionality. Class imbalance usually leads to inaccurate results when traditional feature selection and classification algorithms are used. Owing to its fast learning speed and good classification performance, the extreme learning machine (ELM) has become one of the best classification algorithms, and weighted ELM (WELM) has recently been proposed to deal with class imbalance. However, these methods ignore the negative impact of an imbalanced feature set. This paper proposes a hybrid method based on WELM to handle the multi-class imbalance problem of cancer microarray data at both the feature and algorithmic levels. At the feature level, a corrected feature subset is searched for each class using a class-oriented feature selection method, so that the features correlated with the minority class are explicitly selected. At the algorithmic level, WELM is further modified to strengthen the input nodes with high discrimination power, and an ensemble model is trained to improve generalization. That is, multiple modified WELM models are trained on datasets characterized by different feature subsets; to encourage ensemble diversity, models with low dissimilarity are removed and the retained ones are combined into an ensemble model. Experiments are conducted on eight gene expression datasets with multiple cancer types, and the classification results show that the method significantly outperforms ELM and several recent works. (C) 2017 Elsevier B.V. All rights reserved.

Bibliographic record

  • Source
    Neurocomputing | 2017, No. 29 | pp. 641-650 | 10 pages
  • Author affiliations

    Guangdong Pharmaceut Univ, Sch Med Informat Engn, Guangzhou 510006, Guangdong, Peoples R China;

    Guangdong Pharmaceut Univ, Sch Med Informat Engn, Guangzhou 510006, Guangdong, Peoples R China;

    Guangdong Pharmaceut Univ, Sch Med Informat Engn, Guangzhou 510006, Guangdong, Peoples R China;

    South China Univ Technol, Informat & Network Engn & Res Ctr, Guangzhou 510006, Guangdong, Peoples R China;

    Guangdong Pharmaceut Univ, Sch Med Informat Engn, Guangzhou 510006, Guangdong, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification
  • Keywords

    Multi-class imbalance; Extreme learning machine; High dimension; Feature selection; Ensemble learning

