Journal: Scientific Reports

Cell Identity Codes: Understanding Cell Identity from Gene Expression Profiles using Deep Neural Networks

Abstract

Understanding cell identity is an important task in many biomedical areas. Expression patterns of specific marker genes have been used to characterize a limited number of cell types, but exclusive markers are not available for many cell types. A second approach is to use machine learning to discriminate cell types based on whole gene expression profiles (GEPs). The accuracy of simple classification algorithms such as linear discriminators or support vector machines is limited by the complexity of biological systems. We used deep neural networks to analyze 1040 GEPs from 16 different human tissues and cell types. After comparing different architectures, we identified a specific deep autoencoder structure that can encode a GEP into a vector of 30 numeric values, which we call the cell identity code (CIC). The original GEP can be reproduced from the CIC with an accuracy comparable to that of technical replicates of the same experiment. Although we use an unsupervised approach to train the autoencoder, we show that different values of the CIC are connected to different biological aspects of the cell, such as different pathways or biological processes. The network can use the CIC to reproduce the GEPs of cell types it never saw during training. It can also tolerate some noise in the measurement of the GEP. Furthermore, we introduce the classifier autoencoder, an architecture that can accurately identify cell type based on the GEP or the CIC.
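The encoding idea in the abstract can be illustrated with a small sketch: an autoencoder squeezes each expression profile through a 30-value bottleneck and is trained only to reconstruct its input. This is not the authors' architecture or data. The sample count (1040) and code length (30) come from the abstract; the gene count (200), the hidden width (64), the tanh activations, the learning rate, and the synthetic low-rank data are all assumptions chosen to keep the example small and runnable.

```python
import numpy as np

def init_params(n_genes, n_hidden, n_code, rng):
    """Random weights for a symmetric encoder/decoder (layer sizes are illustrative)."""
    sizes = [n_genes, n_hidden, n_code, n_hidden, n_genes]
    return [(rng.normal(0.0, np.sqrt(1.0 / a), size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer: tanh on hidden layers, linear output."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def encode(params, x):
    """The code is the activation of the middle (bottleneck) layer."""
    return forward(params, x)[len(params) // 2]

def train_step(params, x, lr):
    """One full-batch gradient-descent step on mean squared reconstruction error.

    Returns the updated parameters and the loss measured before the update.
    """
    acts = forward(params, x)
    delta = 2.0 * (acts[-1] - x) / x.size          # gradient at the linear output
    updated = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        grad_w = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Backpropagate through the tanh of the previous layer.
            delta = (delta @ w.T) * (1.0 - acts[i] ** 2)
        updated.append((w - lr * grad_w, b - lr * grad_b))
    return updated[::-1], float(np.mean((acts[-1] - x) ** 2))

# Synthetic stand-in for expression data: a few latent factors mixed into many genes.
rng = np.random.default_rng(0)
n_samples, n_genes, n_code = 1040, 200, 30         # 200 genes is a toy value
latent = rng.normal(size=(n_samples, 10))
gep = latent @ rng.normal(size=(10, n_genes))
gep += 0.1 * rng.normal(size=(n_samples, n_genes))
gep = (gep - gep.mean(axis=0)) / gep.std(axis=0)   # standardize each gene

params = init_params(n_genes, 64, n_code, rng)
losses = []
for _ in range(300):
    params, loss = train_step(params, gep, lr=0.05)
    losses.append(loss)

codes = encode(params, gep)                        # one 30-value code per profile
print(codes.shape, losses[0], losses[-1])
```

Training uses no cell-type labels, which matches the unsupervised setting the abstract describes; any link between individual code values and biology would have to be examined separately after training.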
