BMC Bioinformatics

A robust nonlinear low-dimensional manifold for single cell RNA-seq data


Abstract

BACKGROUND: Modern developments in single-cell sequencing technologies enable broad insights into cellular state. Single-cell RNA sequencing (scRNA-seq) can be used to explore cell types, states, and developmental trajectories to broaden our understanding of cellular heterogeneity in tissues and organs. Analysis of these sparse, high-dimensional experimental results requires dimension reduction. Several methods have been developed to estimate low-dimensional embeddings for filtered and normalized single-cell data. However, methods have yet to be developed for unfiltered and unnormalized count data that estimate uncertainty in the low-dimensional space. We present a nonlinear latent variable model with robust, heavy-tailed error and adaptive kernel learning to estimate low-dimensional nonlinear structure in scRNA-seq data.

RESULTS: Gene expression in a single cell is modeled as a noisy draw from a Gaussian process in high dimensions from low-dimensional latent positions. This model is called the Gaussian process latent variable model (GPLVM). We model residual errors with a heavy-tailed Student's t-distribution to estimate a manifold that is robust to technical and biological noise found in normalized scRNA-seq data. We compare our approach to common dimension reduction tools across a diverse set of scRNA-seq data sets to highlight our model's ability to enable important downstream tasks such as clustering, inferring cell developmental trajectories, and visualizing high throughput experiments on available experimental data.

CONCLUSION: We show that our adaptive robust statistical approach to estimate a nonlinear manifold is well suited for raw, unfiltered gene counts from high-throughput sequencing technologies for visualization, exploration, and uncertainty estimation of cell states.
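
The generative idea in the RESULTS paragraph (expression as a noisy Gaussian process draw from low-dimensional latent positions, with Student's t residuals) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the squared-exponential kernel, the function names `rbf_kernel` and `simulate_t_gplvm`, and every parameter value are hypothetical choices made for illustration. It draws data forward from the model and does not perform the inference or adaptive kernel learning described in the abstract.

```python
import numpy as np


def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel over latent positions X (cells x latent dims).

    The kernel choice is an assumption for this sketch; the abstract only
    says the kernel is learned adaptively.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)


def simulate_t_gplvm(n_cells=50, n_genes=20, latent_dim=2,
                     df=3.0, noise_scale=0.1, seed=0):
    """Draw synthetic expression values from a GPLVM-style generative model.

    Returns latent positions X, noise-free GP values F, and observations Y.
    All argument names and defaults are hypothetical.
    """
    rng = np.random.default_rng(seed)

    # Low-dimensional latent position for each cell.
    X = rng.standard_normal((n_cells, latent_dim))

    # Covariance between cells, with a small jitter for numerical stability.
    K = rbf_kernel(X) + 1e-6 * np.eye(n_cells)
    L = np.linalg.cholesky(K)

    # One independent GP draw per gene: each column of F has covariance K.
    F = L @ rng.standard_normal((n_cells, n_genes))

    # Heavy-tailed residuals: Student's t noise instead of Gaussian noise.
    noise = noise_scale * rng.standard_t(df, size=(n_cells, n_genes))
    Y = F + noise
    return X, F, Y


if __name__ == "__main__":
    X, F, Y = simulate_t_gplvm()
    print(X.shape, F.shape, Y.shape)
```

With a small `df`, occasional residuals land far from the smooth function values in `F`. Those outliers are what a heavy-tailed error model is meant to absorb without distorting the estimated manifold.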