International Conference on Artificial Intelligence and Soft Computing (ICAISC 2006); June 25-29, 2006; Zakopane, Poland

Feature Selection and Ranking of Key Genes for Tumor Classification: Using Microarray Gene Expression Data

Abstract

In this paper we perform a t-test for significant gene expression analysis in different dimensions based on molecular profiles from microarray data, and compare several computational intelligence techniques for classification accuracy on the Leukemia, Lymphoma and Prostate cancer datasets of the Broad Institute and the Colon cancer dataset from the Princeton gene expression project. Classification accuracy is evaluated with Linear Genetic Programs, Multivariate Adaptive Regression Splines (MARS), Classification and Regression Trees (CART) and Random Forests. Linear Genetic Programs and Random Forests perform the best for detecting malignancy of different tumors. Our results demonstrate the potential of using learning machines in diagnosing the malignancy of a tumor. We also address the related issue of ranking the importance of input features, which is itself a problem of great interest. Elimination of the insignificant inputs (genes) leads to a simplified problem and possibly faster and more accurate classification of microarray gene expression data. Experiments on selected cancer datasets have been carried out to assess the effectiveness of this criterion. Results show that using the significant features gives the best performance, consistently across the microarray gene expression datasets we used. The classifiers perform best using the most significant features, except on the Prostate cancer dataset.
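As a rough illustration of the gene-ranking step mentioned in the abstract, the sketch below scores each gene with a two-sample t-statistic and sorts genes by its absolute value. It is not the authors' implementation: the choice of Welch's unequal-variance form, the function names `welch_t` and `rank_genes`, and the synthetic data are all assumptions made for this example.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic for two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(samples, labels):
    """Return gene indices sorted by decreasing |t| between class 1 and class 0."""
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        pos = [row[g] for row, y in zip(samples, labels) if y == 1]
        neg = [row[g] for row, y in zip(samples, labels) if y == 0]
        scores.append(abs(welch_t(pos, neg)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


# Synthetic stand-in for a microarray matrix (assumption, not real data):
# 20 samples x 50 genes, with gene 7 strongly shifted in class 1.
rng = random.Random(0)
labels = [0] * 10 + [1] * 10
samples = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in labels]
for row, y in zip(samples, labels):
    if y == 1:
        row[7] += 6.0

ranking = rank_genes(samples, labels)
top_genes = ranking[:5]  # candidate "key genes" to pass to a classifier
```

In a pipeline like the one the abstract describes, only the top-ranked genes would then be fed to a classifier such as a random forest; how many genes to keep is a tuning choice that the abstract does not specify.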