Source: Biomolecules (U.S. National Institutes of Health literature)

Colorectal Cancer Prediction Based on Weighted Gene Co-Expression Network Analysis and Variational Auto-Encoder

Abstract

An effective feature extraction method is key to improving the accuracy of a prediction model. From the Gene Expression Omnibus (GEO) database, we obtained microarray gene expression data, covering 13,487 genes, for 238 colorectal cancer (CRC) and normal samples. Weighted gene co-expression network analysis (WGCNA) on 173 samples yielded twelve gene modules. By calculating the Pearson correlation coefficient (PCC) between the characteristic genes of each module and colorectal cancer, we identified a key module that was highly correlated with CRC. We screened hub genes from the key module by considering module membership, gene significance, and intramodular connectivity, and selected 10 hub genes as one type of feature for the classifier. We applied a variational autoencoder (VAE) to 1159 genes with significantly different expression and mapped the data into a 10-dimensional representation, which served as a second type of feature. Both types of features were fed to a support vector machine (SVM) classifier for CRC, which reached an accuracy of 0.9692 and an AUC of 0.9981. These results indicate that the two-step feature extraction method, which combines hub genes obtained by WGCNA with a 10-dimensional VAE representation, achieves high accuracy.
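
The key-module step described in the abstract can be illustrated with a minimal sketch. It assumes each module is summarized by one characteristic value per sample and that CRC status is coded as 1 (CRC) or 0 (normal). The module names, the toy values, and the helper names `pearson` and `key_module` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("constant input has undefined correlation")
    return cov / (sx * sy)


def key_module(module_profiles, trait):
    """Return (name, pcc) for the module with the largest |PCC| against the trait."""
    scores = {name: pearson(values, trait) for name, values in module_profiles.items()}
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores[best]


# Toy data invented for illustration: 6 samples, 1 = CRC, 0 = normal.
trait = [1, 1, 1, 0, 0, 0]
module_profiles = {
    "blue": [0.9, 0.8, 0.7, -0.8, -0.9, -0.7],
    "brown": [0.1, -0.2, 0.3, 0.2, -0.1, 0.0],
    "yellow": [-0.5, -0.2, -0.6, 0.1, 0.6, 0.3],
}

name, pcc = key_module(module_profiles, trait)
print(name, round(pcc, 3))
```

Ranking by absolute correlation means a strongly negative module would also be flagged as a candidate, which may or may not match the selection rule used in the paper.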
