Source journal (as listed): World Allergy Organization Journal

Classification of Widely and Rarely Expressed Genes with Recurrent Neural Network


Abstract

Tissue-specific gene expression shapes the formation of tissues, while changes in gene expression reflect the immune response of the human body to environmental stimulation or pressure, particularly in disease conditions such as cancer. A few genes are commonly expressed across tissues or various cancers, while others are not. To investigate the functional differences between widely and rarely expressed genes, we defined, based on the large gene expression data set provided by Uhlen et al., the genes that were expressed in all 32 normal tissues/cancers (widely expressed genes; FPKM > 1 in all samples) and those that were not detected (rarely expressed genes; FPKM < 1 in all samples). Each gene was encoded using its gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment scores. The minimum redundancy maximum relevance (mRMR) method was used to evaluate these features and rank them in an mRMR feature list. Thereafter, we applied the incremental feature selection method with a supervised classifier, a recurrent neural network (RNN), to select the discriminative features for distinguishing widely expressed genes from rarely expressed genes and to construct an optimum RNN classifier. The Youden's indexes achieved by the optimum RNN classifier, evaluated using 10-fold cross-validation, were 0.739 for normal tissues and 0.639 for cancers. Furthermore, the underlying mechanisms of the key discriminative GO and KEGG features were analyzed. The results can facilitate the identification of the expression landscape of genes and the elucidation of how gene expression shapes tissues and the microenvironment of cancers.
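The selection procedure described in the abstract (rank features, then evaluate growing prefixes of the ranked list with cross-validated Youden's index) can be sketched roughly as follows. This is only an illustrative sketch under several assumptions, not the authors' implementation: the mRMR ranking is assumed to be already available as a list of column indices, a nearest-centroid classifier stands in for the RNN, and all function names (`youden_index`, `centroid_predict`, `cross_validated_youden`, `incremental_feature_selection`) are hypothetical.

```python
import random
from statistics import mean


def youden_index(y_true, y_pred):
    """Youden's index J = sensitivity + specificity - 1 (labels: 1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


def centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (an assumption, NOT an RNN): nearest class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min((0, 1), key=lambda c: sq_dist(x, centroids[c])) for x in test_X]


def cross_validated_youden(X, y, n_folds=10, seed=0):
    """Pool predictions over n_folds shuffled (unstratified) folds and score them."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    y_true, y_pred = [], []
    for k in range(n_folds):
        test = idx[k::n_folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        preds = centroid_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        y_true += [y[i] for i in test]
        y_pred += preds
    return youden_index(y_true, y_pred)


def incremental_feature_selection(X, y, ranked_features):
    """Try the top-1, top-2, ... ranked features; keep the prefix with the best J."""
    best_j, best_k = float("-inf"), 0
    for k in range(1, len(ranked_features) + 1):
        cols = ranked_features[:k]
        X_sub = [[row[c] for c in cols] for row in X]
        j = cross_validated_youden(X_sub, y)
        if j > best_j:  # strict '>' keeps the smallest prefix on ties
            best_j, best_k = j, k
    return ranked_features[:best_k], best_j
```

Calling `incremental_feature_selection(X, y, ranked)` with a feature matrix `X` (one row per gene), binary labels `y` (1 for widely expressed, 0 for rarely expressed) and a ranked list of column indices returns the best-scoring prefix of that list together with its Youden's index. In a fuller pipeline the stand-in classifier would be replaced by the model of interest, and stratified folds would likely be preferable.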
