Journal: Mathematical Problems in Engineering: Theory, Methods and Applications

Establishment and Analysis of a Combined Diagnostic Model of Liver Cancer with Random Forest and Artificial Neural Network



Abstract

The incidence of liver cancer (hepatocellular carcinoma; HCC) is rising, and because clinical outcomes are expected to be poor, tumor tissue needs to be distinguished more accurately from adjacent nontumor tissue. The aim of this study was to construct a diagnostic model based on random forest (RF) and an artificial neural network (ANN) that can help identify diseased tissue, such as cancerous tissue, for HCC clinical diagnosis and surgical guidance. GSE36376 and GSE121248 from the Gene Expression Omnibus (GEO) were used as training sets. The R package “limma” and WGCNA were used to filter the training sets for differentially expressed genes that were statistically significant (p < 0.05). To better understand the biological functions and characteristics of these genes, GO and KEGG enrichment analyses were performed in R. To select and further characterize the key genes, we performed PPI analysis and random forest analysis. Next, we built the ANN to predict the training sets and a validation set (GSE84402), and plotted ROC curves to calculate the area under the curve (AUC). Immune cell infiltration analysis then indicated differences in immune cell subsets between the control and case groups. Finally, survival analysis of the key genes was carried out using data from the TCGA database. The ANN was built on the expression of the 9 key genes, and the accuracy of the final model was assessed with ROC curves. The AUC was 0.984 (95% CI 0.972–0.993) in the training sets. Predictive capability was further assessed on the validation set, where the AUC was 0.929 (95% CI 0.786–1.000). In summary, this method effectively distinguishes hepatocellular carcinoma tissue from the corresponding noncancerous tissue and suggests a reasonable new approach to the early diagnosis of liver cancer.
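The abstract reports each AUC with a 95% confidence interval but does not say how the interval was obtained. As an illustration only, the sketch below computes AUC as the fraction of (tumor, nontumor) pairs that the classifier ranks correctly and derives an interval with a percentile bootstrap. The bootstrap is an assumption, not the authors' stated method; the function names `auc_score` and `bootstrap_auc_ci` are hypothetical; and the labels and scores are made-up toy values, not data from the study.

```python
import random


def auc_score(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for AUC (assumed method, for illustration)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy example: 1 = tumor, 0 = adjacent nontumor; scores stand in for model outputs.
demo_labels = [0, 0, 0, 0, 1, 1, 1, 1]
demo_scores = [0.10, 0.30, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90]

demo_auc = auc_score(demo_labels, demo_scores)  # 15 of 16 pairs are ranked correctly
demo_ci = bootstrap_auc_ci(demo_labels, demo_scores)
print(f"AUC = {demo_auc:.4f}, 95% CI = ({demo_ci[0]:.3f}, {demo_ci[1]:.3f})")
```

With only a few samples the interval is wide, and its upper bound can reach 1.000 when many resamples are perfectly separated. The wide interval reported for the validation set has that same shape, although the abstract does not give the validation sample size.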
