Conference proceedings: Advances in knowledge discovery and data mining

Attribute Selection and Classification of Prostate Cancer Gene Expression Data Using Artificial Neural Networks

Abstract

Artificial Intelligence (AI) approaches to medical diagnosis and cancer prediction are important and growing areas of research. Artificial Neural Networks (ANNs) are one such approach that has been successfully applied in these areas. Various types of clinical datasets have been used in intelligent decision-making systems for medical diagnosis, especially of cancer, for over three decades. However, gene expression datasets are complex and have large numbers of attributes, which makes classification and prediction more difficult for AI approaches. The prostate cancer dataset is one such dataset, with 12600 attributes and only 102 samples. In this paper, we propose an extended ANN-based approach for classification and prediction of prostate cancer using gene expression data. First, we use four attribute selection approaches, namely Sequential Floating Forward Selection (SFFS), RELIEFF, Sequential Backward Feature Selection (SFBS) and Significant Attribute Evaluation (SAE), to identify the most influential attributes among the 12600. We then use ANNs and Naive Bayes for classification with the complete set of attributes as well as with the various subsets obtained from the attribute selection methods. Experimental results show that, with the full set of attributes, ANNs outperformed Naive Bayes, achieving a classification accuracy of 98.2% compared to 62.74%. Further, with the 21 attributes selected by SFFS, ANNs achieved better classification accuracy (100%) than Naive Bayes. For prediction using ANNs, SFFS achieved the best results, with 92.31% accuracy, by correctly predicting 24 out of 26 samples provided for independent sample testing. Moreover, some of the genes selected by SFFS have been identified as directly related to cancer and tumours. Our results indicate that combining standard feature selection methods with ANNs provides the most impressive results.
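
The abstract does not describe how SFFS was implemented, so the following is only a minimal, standard-library sketch of the general floating forward search idea, not the authors' code. The function name `sffs`, the criterion `toy_score` and the `WEIGHTS` values are all hypothetical; in a setting like the paper's, the criterion would more plausibly be the accuracy of a classifier trained on the candidate subset.

```python
# Hypothetical per-attribute usefulness values, for illustration only.
WEIGHTS = [0.1, 0.9, 0.3, 0.8, 0.2]


def toy_score(subset):
    """Toy criterion (an assumption): sum of the hypothetical weights."""
    return sum(WEIGHTS[f] for f in subset)


def sffs(n_features, score, k):
    """Sequential Floating Forward Selection sketch.

    Repeatedly adds the attribute that most improves `score`, then
    conditionally removes attributes while doing so beats the best
    subset of that smaller size seen so far.
    Returns the best subset of size k that was found.
    """
    selected = []
    best = {}  # subset size -> (score, subset)

    while len(selected) < k:
        # Forward step: add the single most helpful remaining attribute.
        candidates = [f for f in range(n_features) if f not in selected]
        to_add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [to_add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))

        # Floating backward steps: drop the least useful attribute as long
        # as the reduced subset beats the recorded best of the same size.
        while len(selected) > 2:
            to_drop = max(
                selected,
                key=lambda f: score([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != to_drop]
            s_reduced = score(reduced)
            if s_reduced > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_reduced, list(reduced))
            else:
                break

    return best[k][1]


if __name__ == "__main__":
    print(sffs(len(WEIGHTS), toy_score, 2))
```

With an additive criterion such as `toy_score`, the backward steps never fire and the search reduces to picking the highest-weighted attributes. The floating behaviour only matters when attributes interact, as they typically do when the criterion is classifier performance.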