Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Feature Selection for Optimized High-Dimensional Biomedical Data Using an Improved Shuffled Frog Leaping Algorithm



Abstract

High-dimensional biomedical datasets contain thousands of features that can be used in the molecular diagnosis of disease. However, such datasets also contain many irrelevant or weakly correlated features, which affect the predictive accuracy of diagnosis. Without a feature selection algorithm, it is difficult for existing classification techniques to accurately identify patterns in the features. The purpose of feature selection is not only to identify a feature subset from the original set of features without reducing the predictive accuracy of the classification algorithm, but also to reduce the computational overhead of data mining. In this paper, we present an improved shuffled frog leaping algorithm that introduces a chaos memory weight factor, an absolute balance group strategy, and an adaptive transfer factor. The proposed approach explores the space of possible subsets to obtain the set of features that maximizes predictive accuracy and minimizes the number of irrelevant features in high-dimensional biomedical data. To evaluate its effectiveness, we employ the K-nearest neighbor method in a comparative analysis of the proposed approach against genetic algorithms, particle swarm optimization, and the shuffled frog leaping algorithm. Experimental results show that the improved algorithm achieves improvements in the identification of relevant subsets and in classification accuracy.
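The abstract describes a search over feature subsets scored with a K-nearest neighbor classifier but gives no formulas, so the following is only a minimal sketch of how such a subset could be scored, not the authors' method. The function names, the leave-one-out protocol, the weight `alpha`, and the toy data are all assumptions made for illustration; the chaos memory weight factor, absolute balance group strategy, and adaptive transfer factor are not modeled here.

```python
from collections import Counter
from math import dist


def knn_loo_accuracy(X, y, mask, k=1):
    """Leave-one-out accuracy of k-NN using only features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    Z = [[row[j] for j in cols] for row in X]
    correct = 0
    for i, zi in enumerate(Z):
        # Distances from sample i to every other sample, nearest first.
        neighbours = sorted(
            (dist(zi, zj), y[j]) for j, zj in enumerate(Z) if j != i
        )[:k]
        votes = Counter(label for _, label in neighbours)
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(Z)


def fitness(X, y, mask, alpha=0.9, k=1):
    """Hypothetical score: mostly accuracy, plus a small bonus for fewer features."""
    acc = knn_loo_accuracy(X, y, mask, k)
    kept = sum(mask) / len(mask)
    return alpha * acc + (1 - alpha) * (1 - kept)


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

for mask in ([1, 0], [0, 1], [1, 1]):
    print(mask, knn_loo_accuracy(X, y, mask), round(fitness(X, y, mask), 3))
```

In this toy case the noisy feature dominates the distance whenever it is included, so the mask that keeps only feature 0 scores highest. A search procedure such as a shuffled frog leaping variant would propose candidate masks and use a score of this kind to rank them.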
