Conference: International Conference on High Performance Computing and Applications

Filter vs. Wrapper approach for optimum gene selection of high dimensional gene expression dataset: An analysis with cancer datasets


Abstract

In bioinformatics, gene expression experiments generate thousands of measurements, which are generally used to collect information about gene expression differences from tissue and cell samples. Selecting an optimal subset of genes from such datasets, and classifying samples on that basis, plays an important role in disease prediction and diagnosis. The open question is which gene selection approach yields the highest classification accuracy on such high-dimensional data: whether the filter approach or the wrapper approach is more suitable, and which classifier works well with each. To answer this question, this paper evaluates filter and wrapper gene selection techniques using supervised classifiers on three well-known public-domain datasets: Ovarian Cancer, Lymphomas, and Leukemia. The ReliefF method is used for filter-based gene selection, and a random gene subset selection algorithm is used for wrapper-based gene selection. For classification, several linear classifiers and ensemble classifiers are tested. The paper also reports timing details, so that the analysis can indicate which approach offers the better trade-off between running time and classification accuracy on the selected datasets.
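The contrast between the two selection styles can be illustrated with a minimal sketch. It is not the paper's implementation, and several pieces are assumptions made for illustration only: the filter uses basic two-class Relief (one nearest hit and one nearest miss per sample) rather than full ReliefF, the wrapper scores random gene subsets with a leave-one-out nearest-centroid classifier as a stand-in for the classifiers the paper tests, and all function names and the toy data are hypothetical.

```python
import random

def relief_scores(X, y):
    """Basic Relief weights (simplified stand-in for ReliefF).
    Assumes two classes with at least two samples in each."""
    n, d = len(X), len(X[0])
    # Per-gene value range, used to normalize differences to [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def dist(a, b):
        return sum(abs(a[j] - b[j]) / spans[j] for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        h = min(hits, key=lambda k: dist(X[i], X[k]))    # nearest same-class sample
        m = min(misses, key=lambda k: dist(X[i], X[k]))  # nearest other-class sample
        for j in range(d):
            # Reward genes that differ across classes and agree within a class.
            w[j] += (abs(X[i][j] - X[m][j]) - abs(X[i][j] - X[h][j])) / (spans[j] * n)
    return w

def relief_filter(X, y, k):
    """Filter approach: keep the k top-scoring genes; no classifier is involved."""
    w = relief_scores(X, y)
    return sorted(sorted(range(len(w)), key=lambda j: -w[j])[:k])

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given genes."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
        point = [X[i][g] for g in genes]
        pred = min(centroids, key=lambda c: sum((p - q) ** 2
                                                for p, q in zip(point, centroids[c])))
        correct += pred == y[i]
    return correct / len(X)

def random_subset_wrapper(X, y, k, trials, seed=0):
    """Wrapper approach: score random k-gene subsets with the classifier, keep the best."""
    rng = random.Random(seed)
    best_genes, best_acc = None, -1.0
    for _ in range(trials):
        genes = sorted(rng.sample(range(len(X[0])), k))
        acc = loo_accuracy(X, y, genes)
        if acc > best_acc:
            best_genes, best_acc = genes, acc
    return best_genes, best_acc

def make_toy_data(n=24, d=30, seed=1):
    """Synthetic expression matrix (hypothetical); only genes 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(d)]
        row[0] += 8 * label
        row[1] -= 8 * label
        X.append(row)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    filt = relief_filter(X, y, k=2)
    print("filter genes:", filt, "accuracy:", loo_accuracy(X, y, filt))
    print("wrapper genes, accuracy:", random_subset_wrapper(X, y, k=2, trials=50))
```

The sketch also hints at the timing trade-off the abstract raises: the filter scores every gene once without training a classifier, whereas the wrapper retrains and evaluates the classifier for each candidate subset, so its cost grows with the number of trials.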
