Journal: Neural Computing & Applications

Classification and diagnostic prediction of prostate cancer using gene expression and artificial neural networks


Abstract

Prostate cancer is the fourth most common cancer overall and the second most common cancer in men. The rate of increase in prostate cancer incidence is higher than the overall increase in cancer incidence, and 68% of prostate cancer cases occur in developed countries. There has been very little research on which techniques are most suitable for analysing prostate cancer gene expression datasets to identify the genes that may be most related to prostate cancer. This paper attempts to identify significant (influential) attributes in a well-established prostate cancer gene expression dataset consisting of over 12,533 attributes for 102 samples (50 normal, 52 tumour). Seven different statistical and artificial intelligence (AI)-based feature selection methods were paired with four different classifiers, namely artificial neural networks (ANNs), Naive Bayes, AdaBoost and J48. Prediction experiments were carried out using ANNs with testing on unseen samples. In our experiments, ANNs outperformed all other approaches for classification, achieving 100% accuracy with sequential forward feature selection (SFFS). Naive Bayes and AdaBoost achieved their best accuracies of 96.3% and 93.13%, respectively, with support vector machine (SVM) attribute selection, whereas J48 reached only 89.21% with the SFFS approach. For the prediction experiments, ANNs obtained an accuracy of 95.1% with SVM attribute selection (correctly predicting 96 out of 102 samples). Finally, a search of the National Center for Biotechnology Information database found that 21 of the 24 attributes (87.5%) chosen by SVM attribute selection have a reference to cancer/tumour, thereby establishing a link between feature selection and biological plausibility. The main contribution of this paper is in identifying the importance of pairing the most appropriate feature selection strategy with the most appropriate classification strategy when dealing with significantly underdetermined data.
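To make the idea of pairing a feature selection method with a classifier concrete, the following is a minimal Python sketch of greedy sequential forward selection. It is not the paper's implementation: a simple nearest-centroid classifier stands in for the ANN, accuracy is estimated by leave-one-out testing, and the function names and the toy data are hypothetical.

```python
from statistics import mean


def centroid_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier
    restricted to the given feature indices (a stand-in classifier)."""
    correct = 0
    for i in range(len(X)):
        # Hold out sample i and train on the rest.
        train = [(row, lbl) for j, (row, lbl) in enumerate(zip(X, y)) if j != i]
        centroids = {}
        for label in sorted({lbl for _, lbl in train}):
            rows = [row for row, lbl in train if lbl == label]
            centroids[label] = [mean(r[f] for r in rows) for f in features]
        point = [X[i][f] for f in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, k):
    """Greedily add the feature that most improves accuracy until k are chosen."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: centroid_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): only feature 1 separates the two classes.
X = [
    [0.5, 0.1, 3.0],
    [0.4, 0.2, 1.0],
    [0.6, 0.0, 2.0],
    [0.5, 0.9, 2.0],
    [0.4, 1.0, 3.0],
    [0.6, 0.8, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

chosen = sequential_forward_selection(X, y, k=1)
print(chosen, centroid_accuracy(X, y, chosen))
```

With far more attributes than samples, as in the dataset described above, a wrapper of this kind evaluates the classifier many times, which is one reason the choice of selection strategy and classifier has to be considered together.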
This paper also emphasizes the differences and similarities between classification and prediction of prostate cancer. We considered one further approach in the classification and prediction experiments: in addition to using the seven feature selection approaches individually, we derived new sets of attributes by combining all selected attributes (union), keeping only the attributes common to the methods (intersection), and taking the remaining attributes (not common).
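The derived attribute sets can be illustrated with ordinary set operations. The sketch below is an assumption about how such sets might be built, not the authors' code: the method labels and gene identifiers are hypothetical, and "not common" is interpreted here as the union minus the intersection.

```python
def combine_feature_sets(selections):
    """Derive union, intersection and 'not common' attribute sets
    from a mapping of selection-method name -> selected attributes."""
    sets = [set(attrs) for attrs in selections.values()]
    union = set.union(*sets)                # every attribute chosen by any method
    intersection = set.intersection(*sets)  # attributes chosen by all methods
    not_common = union - intersection       # assumed meaning of "not common"
    return union, intersection, not_common


# Hypothetical selections from three feature selection methods.
selections = {
    "method_a": ["g1", "g2", "g3"],
    "method_b": ["g2", "g3", "g4"],
    "method_c": ["g3", "g5"],
}

union, intersection, not_common = combine_feature_sets(selections)
print(sorted(union))
print(sorted(intersection))
print(sorted(not_common))
```

Each derived set can then be passed to a classifier in the same way as the output of a single feature selection method.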
