Journal: Neural Computing & Applications

A snapshot neural ensemble method for cancer-type prediction based on copy number variations


Abstract

Accurate diagnosis and prognosis of cancer are specific to patients with particular cancer types and molecular traits, which need to be addressed carefully. The discovery of important biomarkers is becoming a key step toward understanding the molecular mechanisms of carcinogenesis, in which genomics data and clinical outcomes need to be analyzed before any clinical decision is made. Copy number variations (CNVs) are found to be associated with the risk of individual cancers and hence can be used to reveal genetic predispositions before cancer develops. In this paper, we collect CNV data for about 8000 cancer patients covering 14 different cancer types from The Cancer Genome Atlas. Two different sparse representations of CNVs, based on 578 oncogenes and 20,308 protein-coding genes and including genomic deletions and duplications across the samples, are then prepared. Next, we train Conv-LSTM and convolutional autoencoder (CAE) networks using both representations and create snapshot models. While the Conv-LSTM can capture locally and globally important features, the CAE can use unsupervised pretraining to initialize the weights in the subsequent convolutional layers against the sparsity. A model averaging ensemble (MAE) is then applied to combine the snapshot models in order to make a single prediction. Finally, we identify the most significant CNV biomarkers using guided-gradient class activation map plus (GradCAM++) and rank the top genes for different cancer types. Results covering several experiments show fairly high prediction accuracies for the majority of cancer types. In particular, using protein-coding genes, the Conv-LSTM and CAE networks predict cancer types correctly in at least 72.96% and 76.77% of cases, respectively. In contrast, using oncogenes gives moderately higher accuracies of 74.25% and 78.32%, whereas the snapshot model based on MAE shows an overall accuracy improvement of 2.5%.
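The model-averaging step described in the abstract can be illustrated with a minimal sketch. It assumes that each snapshot model outputs per-class probabilities and that the ensemble takes their unweighted mean before picking the most likely class; the abstract does not give the exact averaging scheme, so the function name `average_ensemble`, the array layout, and the toy numbers below are hypothetical and are not taken from the paper.

```python
import numpy as np


def average_ensemble(snapshot_probs):
    """Combine snapshot models by unweighted averaging of class probabilities.

    snapshot_probs: array-like of shape (n_snapshots, n_samples, n_classes),
    where each entry is a probability predicted by one snapshot model.
    Returns (mean_probs, labels): the averaged probabilities and the
    index of the highest-probability class for each sample.
    """
    probs = np.asarray(snapshot_probs, dtype=float)
    if probs.ndim != 3:
        raise ValueError("expected shape (n_snapshots, n_samples, n_classes)")
    mean_probs = probs.mean(axis=0)          # average over the snapshot axis
    labels = mean_probs.argmax(axis=1)       # one predicted class per sample
    return mean_probs, labels


# Toy input (made-up numbers): 3 snapshots, 2 samples, 3 classes.
toy = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    [[0.2, 0.7, 0.1], [0.2, 0.2, 0.6]],
]
mean_probs, labels = average_ensemble(toy)
print(labels)
```

Averaging probabilities rather than taking a majority vote lets a confident snapshot outweigh two less certain ones, as happens for the first toy sample here.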