A hybrid genetic algorithm and support vector machine classifier for feature selection and classification of gene expression

Abstract

Advances in gene expression technology make it possible to measure the expression levels of thousands of genes in parallel. Gene expression microarray data are expected to significantly aid the development of efficient cancer diagnosis and classification platforms. A key issue in this setting is the efficient selection of a small subset of genes that may contribute substantially to disease identification from the thousands of genes measured on microarrays, which are inherently noisy. This research deals with finding a small subset of informative genes from gene expression data that maximizes classification accuracy. It proposes a hybrid of a Genetic Algorithm and a Support Vector Machine classifier for selecting an optimal small subset of informative genes and classifying samples using that subset. Two benchmark data sets were used to evaluate the usefulness of the approach on low- and high-dimensional data. Although the experimental results showed that the hybrid method performed better than some of the best previous methods on the low-dimensional data, its performance deteriorated significantly on the higher-dimensional data. An improved version of the hybrid method was therefore designed by introducing a new feature selection algorithm, based on an improved chromosome representation, to replace the original algorithm, which appeared to perform poorly on high-dimensional data. The gene expression microarray classification results demonstrated that the proposed method performed better than the original method and the previous methods. The informative genes identified in the experiments proved to be biologically plausible when compared with the biological results reported by biologists and computer scientists.
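
The abstract does not give the details of either the original or the improved algorithm, so the following is only a minimal sketch of the general idea of wrapping a classifier inside a genetic algorithm for gene selection. Everything in it is an assumption made for illustration: the binary-mask chromosome (not the improved representation the abstract mentions), tournament selection, one-point crossover, the per-gene penalty, and the names `svm_cv_fitness` and `select_features`. The SVM part assumes scikit-learn is installed.

```python
import random
from typing import Callable, List, Sequence

Mask = List[int]  # hypothetical chromosome: 1 = gene selected, 0 = gene dropped


def svm_cv_fitness(X: Sequence[Sequence[float]], y: Sequence[int],
                   folds: int = 3) -> Callable[[Mask], float]:
    """Build a fitness function: cross-validated accuracy of a linear SVM."""
    # Imported lazily so the GA itself does not depend on scikit-learn.
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def fitness(mask: Mask) -> float:
        cols = [j for j, bit in enumerate(mask) if bit]
        if not cols:
            return 0.0  # an empty subset cannot classify anything
        X_sub = [[row[j] for j in cols] for row in X]
        acc = cross_val_score(SVC(kernel="linear"), X_sub, y, cv=folds).mean()
        # Assumed penalty: prefer smaller subsets when accuracy is similar.
        return acc - 0.001 * len(cols)

    return fitness


def select_features(fitness: Callable[[Mask], float], n_genes: int,
                    pop_size: int = 20, generations: int = 30,
                    mutation_rate: float = 0.05, seed: int = 0) -> Mask:
    """Search for a high-fitness gene mask (needs n_genes >= 2, pop_size >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]

    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]
        elite = max(scored, key=lambda s: s[0])[1]

        def tournament() -> Mask:
            # Pick two individuals at random and keep the fitter one.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [elite[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_genes - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        pop = children

    return max(pop, key=fitness)
```

With an expression matrix `X` (one row per sample, one column per gene) and class labels `y`, a call such as `select_features(svm_cv_fitness(X, y), n_genes=len(X[0]))` would return a 0/1 mask over the genes. A plain binary mask over thousands of genes gives a very large search space, which is one plausible reason a wrapper of this kind could struggle on high-dimensional data.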