Conference: International Conference on Advanced Communication Technology
MF-GARF: Hybridizing Multiple Filters and GA Wrapper for Feature Selection of Microarray Cancer Datasets

Abstract

DNA microarray technology is a valuable advancement in the medical field, but it gives rise to many challenges, such as the curse of dimensionality and high storage and computational requirements. In this paper, we propose a hybrid approach based on multiple filters and a GA wrapper (MF-GARF) that uses Random Forest as the fitness evaluator of features. The proposed approach comprises three phases. The first is the Relevancy block, which contains the information-theory-based filters Information Gain, Gain Ratio, and Gini Index and is responsible for ensuring relevancy by removing irrelevant and noisy features. The second is the Redundancy block, which uses Pearson correlation statistics to remove redundancy among features. The final phase is the Optimization block, which contains a Genetic Algorithm wrapper with Random Forest as the fitness evaluator and is responsible for generating an optimal feature subset with high predictive power. Random Forest with 10-fold cross-validation is used to calculate the classification accuracy of the selected feature subset. Experiments are carried out on 7 publicly available benchmark microarray cancer datasets, and the proposed algorithm achieves good accuracy with a minimal number of selected features on all datasets. A comparison with other state-of-the-art hybrid techniques validates the effectiveness of the proposed approach.
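
The three phases described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: features are discretized with a median split before scoring, the three filters are combined by taking the union of each filter's top `top_k` features, the correlation cutoff `threshold` is arbitrary, and the GA settings (`pop_size`, `generations`, `p_mut`) are illustrative. The fitness function is also a stand-in: the paper uses Random Forest as the fitness evaluator, whereas this sketch uses leave-one-out 1-nearest-neighbour accuracy so that it runs with the standard library only. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from statistics import median

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def filter_scores(column, y):
    """Information Gain, Gain Ratio, Gini gain after a median split (assumed)."""
    cut = median(column)
    parts = [[t for v, t in zip(column, y) if v <= cut],
             [t for v, t in zip(column, y) if v > cut]]
    parts = [p for p in parts if p]
    weights = [len(p) / len(y) for p in parts]
    info_gain = entropy(y) - sum(w * entropy(p) for w, p in zip(weights, parts))
    split_info = -sum(w * math.log2(w) for w in weights)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    gini_gain = gini(y) - sum(w * gini(p) for w, p in zip(weights, parts))
    return info_gain, gain_ratio, gini_gain

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def relevancy_block(X, y, top_k):
    """Phase 1: union of the top_k features under each filter (assumed rule)."""
    cols = list(zip(*X))
    scores = [filter_scores(c, y) for c in cols]
    keep = set()
    for m in range(3):
        ranked = sorted(range(len(cols)), key=lambda j: scores[j][m], reverse=True)
        keep.update(ranked[:top_k])
    return sorted(keep, key=lambda j: scores[j][0], reverse=True)  # best first

def redundancy_block(X, features, threshold):
    """Phase 2: drop a feature if |Pearson r| with a kept one reaches threshold."""
    cols = list(zip(*X))
    kept = []
    for j in features:
        if all(abs(pearson(cols[j], cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept

def ga_wrapper(features, fitness, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Phase 3: bit-mask GA with elitism, binary tournaments, one-point crossover."""
    rng, n, cache = random.Random(seed), len(features), {}
    def decode(mask):
        return [f for f, bit in zip(features, mask) if bit]
    def score(mask):
        key = tuple(mask)
        if key not in cache:  # empty subsets are never preferred
            cache[key] = fitness(decode(mask)) if any(mask) else float("-inf")
        return cache[key]
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        children = [best]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            a = max(rng.sample(pop, 2), key=score)
            b = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = children
        best = max(pop, key=score)
    return decode(best)

def make_fitness(X, y):
    """Stand-in fitness: leave-one-out 1-NN accuracy minus a small size penalty.
    The abstract uses Random Forest as the fitness evaluator instead."""
    def fitness(subset):
        hits = 0
        for i, row in enumerate(X):
            others = (k for k in range(len(X)) if k != i)
            near = min(others, key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in subset))
            hits += y[near] == y[i]
        return hits / len(X) - 0.01 * len(subset)
    return fitness

# Synthetic demo: feature 0 is informative, feature 1 nearly duplicates it.
rng = random.Random(42)
y = [i % 2 for i in range(30)]
X = []
for label in y:
    s = 3.0 * label + rng.gauss(0, 0.5)
    X.append([s, s + rng.gauss(0, 0.01)] + [rng.gauss(0, 1) for _ in range(8)])
relevant = relevancy_block(X, y, top_k=3)
pruned = redundancy_block(X, relevant, threshold=0.9)
selected = ga_wrapper(pruned, make_fitness(X, y))
print(relevant, pruned, selected)
```

On the synthetic data, the redundancy phase should keep only one of the two near-duplicate informative features, and the GA then searches over subsets of what remains. To follow the abstract more closely, `make_fitness` could be replaced by a function that returns the cross-validated accuracy of a Random Forest trained on the candidate subset.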
