Journal: Briefings in Bioinformatics

Machine learning and statistical methods for clustering single-cell RNA-sequencing data

Abstract

Single-cell RNA sequencing (scRNA-seq) technologies have enabled large-scale whole-transcriptome profiling of individual cells in a cell population. A core analysis of scRNA-seq transcriptome profiles is clustering the single cells to reveal cell subtypes and infer cell lineages based on the relations among the cells. This article reviews the machine learning and statistical methods for clustering scRNA-seq transcriptomes developed in the past few years. The review focuses on how conventional clustering techniques such as hierarchical clustering, graph-based clustering, mixture models, k-means, ensemble learning, neural networks and density-based clustering are modified or customized to tackle the unique challenges in scRNA-seq data analysis, such as the dropout of low-expression genes, low and uneven read coverage of transcripts, highly variable total mRNAs from single cells and ambiguous cell markers in the presence of technical biases and irrelevant confounding biological variations. We review how cell-specific normalization, the imputation of dropouts and dimension reduction methods can be applied with new statistical or optimization strategies to improve the clustering of single cells. We also introduce more advanced approaches for clustering scRNA-seq transcriptomes in time-series data and across multiple cell populations, and for detecting rare cell types. Several software packages developed to support the cluster analysis of scRNA-seq data are also reviewed and experimentally compared to evaluate their performance and efficiency. Finally, we conclude with useful observations and possible future directions in scRNA-seq data analytics. Availability: All the source code and data are available at https://github.com/kuanglab/single-cell-review.
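To make the pipeline named in the abstract concrete (cell-specific normalization, dimension reduction, then clustering), here is a minimal NumPy sketch on synthetic counts. It is a generic baseline, not any of the methods or packages compared in the review. The function names (`normalize_log`, `pca`, `kmeans`, `simulate`), the simulated two-population data, the 20% dropout rate and the choice of two principal components are all assumptions made for illustration.

```python
import numpy as np


def normalize_log(counts):
    """Scale each cell (row) to the median library size, then apply log1p."""
    lib = counts.sum(axis=1, keepdims=True).astype(float)
    lib[lib == 0] = 1.0  # guard against division by zero for empty cells
    return np.log1p(counts / lib * np.median(lib))


def pca(x, n_components):
    """Project the rows of x onto the top principal components via SVD."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every cell to every center.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def simulate(n_per_group=60, n_genes=50, seed=1):
    """Hypothetical data: two cell groups with distinct marker genes,
    uneven sequencing depth per cell, and random dropout to zero."""
    rng = np.random.default_rng(seed)
    base = np.full((2, n_genes), 2.0)
    base[0, :10] = 20.0      # marker genes for group 0
    base[1, 10:20] = 20.0    # marker genes for group 1
    truth = np.repeat([0, 1], n_per_group)
    depth = rng.uniform(0.5, 2.0, size=(2 * n_per_group, 1))
    counts = rng.poisson(base[truth] * depth)
    counts[rng.random(counts.shape) < 0.2] = 0  # simulated dropout events
    return counts, truth


if __name__ == "__main__":
    counts, truth = simulate()
    labels = kmeans(pca(normalize_log(counts), n_components=2), k=2)
    # Cluster labels are arbitrary, so score agreement up to a label swap.
    agreement = max((labels == truth).mean(), (labels != truth).mean())
    print(f"agreement with simulated groups: {agreement:.2f}")
```

Library-size scaling is only the simplest form of cell-specific normalization, and this sketch does not impute dropouts at all; the specialized methods surveyed in the article address exactly those limitations.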