Journal: BMC Genomics

Detection of high variability in gene expression from single-cell RNA-seq profiling

Abstract

Background: Advances in next-generation sequencing technology enable gene expression to be mapped at the single-cell level, so that single-cell RNA sequencing (scRNA-seq) can be used to track cell heterogeneity and determine cell subpopulations. Unlike conventional RNA-seq, where differential expression analysis is the integral component, the most important goal of scRNA-seq is to identify highly variable genes across a population of cells, which must account for the discrete nature of single-cell gene expression and for the distinctive library preparation protocols used in single-cell sequencing. However, there is a lack of a generic expression variation model for different scRNA-seq data sets. Hence, the objective of this study is to develop a gene expression variation model (GEVM) that uses the relationship between the coefficient of variation (CV) and the average expression level to address the over-dispersion of single-cell data, together with a corresponding measure of statistical significance to quantify the variably expressed genes (VEGs).

Results: We built a simulation framework that generates scRNA-seq data with different numbers of cells, model parameters, and variation levels. We implemented the GEVM and demonstrated its robustness on a set of simulated scRNA-seq data under different conditions. We evaluated the robustness of the regression using root-mean-square error (RMSE) and assessed the parameter estimation process by varying initial model parameters that deviated from a homogeneous cell population. We also applied the GEVM to real scRNA-seq data to test its performance in distinct cases.

Conclusions: In this paper, we propose a gene expression variation model that can be used to determine significant variably expressed genes. Applying the model to the simulated single-cell data, we observed robust parameter estimation under different conditions with minimal root-mean-square errors. We also examined the model on two distinct scRNA-seq data sets generated with different single-cell protocols and determined the VEGs. Obtaining the VEGs allowed us to observe possible subpopulations, providing further evidence of cell heterogeneity. With the GEVM, significant variably expressed genes can be readily identified in different scRNA-seq data sets.
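The abstract does not state the functional form of the GEVM, so the following is only a minimal sketch of the general idea of relating CV to mean expression, not the authors' model. It assumes a straight-line fit of log10(CV) against log10(mean), reports the RMSE of that fit, and flags genes whose residual exceeds a cutoff. The function names (`simulate_counts`, `cv_mean_fit`, `flag_variable_genes`), the negative binomial simulation, and the z-score cutoff of 2 are all illustrative assumptions.

```python
import numpy as np


def simulate_counts(rng, n_genes=200, n_cells=50, n_spiked=5):
    """Simulate a genes x cells count matrix (assumed negative binomial).

    The first `n_spiked` genes are switched off in half of the cells to
    mimic a subpopulation and so inflate their CV.
    """
    mu = rng.uniform(1.0, 100.0, size=n_genes)
    mu[:n_spiked] = 50.0
    size = 10.0                       # assumed NB dispersion parameter
    p = size / (size + mu)            # gives an NB mean equal to mu
    counts = rng.negative_binomial(size, p[:, None], size=(n_genes, n_cells))
    counts[:n_spiked, : n_cells // 2] = 0
    return counts


def cv_mean_fit(counts):
    """Fit log10(CV) = intercept + slope * log10(mean) across genes.

    Returns per-gene mean, CV, residual from the fitted line (NaN for
    genes that could not be used) and the RMSE of the fit.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=1)
    sd = counts.std(axis=1, ddof=1)
    cv = np.zeros_like(mean)
    expressed = mean > 0
    cv[expressed] = sd[expressed] / mean[expressed]

    usable = cv > 0                   # need positive mean and CV for logs
    x = np.log10(mean[usable])
    y = np.log10(cv[usable])
    slope, intercept = np.polyfit(x, y, 1)

    resid = np.full(mean.shape, np.nan)
    resid[usable] = y - (intercept + slope * x)
    rmse = float(np.sqrt(np.mean(resid[usable] ** 2)))
    return mean, cv, resid, rmse


def flag_variable_genes(resid, z_cutoff=2.0):
    """Flag genes whose residual z-score exceeds a hypothetical cutoff."""
    valid = np.isfinite(resid)
    flagged = np.zeros(resid.shape, dtype=bool)
    flagged[valid] = resid[valid] / resid[valid].std() > z_cutoff
    return flagged


if __name__ == "__main__":
    counts = simulate_counts(np.random.default_rng(0))
    mean, cv, resid, rmse = cv_mean_fit(counts)
    flagged = flag_variable_genes(resid)
    print("RMSE of log-log fit:", rmse)
    print("Indices of flagged genes:", np.flatnonzero(flagged))
```

A residual z-score is used here only because it is simple; the significance test described in the abstract may differ, and the full paper should be consulted for the actual model and test.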
