Source: U.S. National Institutes of Health literature > PLoS Clinical Trials

CaSTLe – Classification of single cells by transfer learning: Harnessing the power of publicly available single cell RNA sequencing experiments to annotate new experiments



Abstract

Single-cell RNA sequencing (scRNA-seq) is an emerging technology for profiling the gene expression of thousands of cells at single-cell resolution. Currently, the labeling of cells in an scRNA-seq dataset is performed by manually characterizing clusters of cells or by fluorescence-activated cell sorting (FACS). Both methods have inherent drawbacks: the first depends on the clustering algorithm used and on the knowledge and arbitrary decisions of the annotator, and the second involves an experimental step in addition to the sequencing and cannot be incorporated into the higher-throughput scRNA-seq methods. We therefore suggest a different approach for cell labeling, namely, classifying cells from scRNA-seq datasets by using a model transferred from different (previously labeled) datasets. This approach can complement existing methods and, in some cases, even replace them. Such a transfer-learning framework requires selecting informative features and training a classifier. The specific implementation of the framework that we propose, designated "CaSTLe: classification of single cells by transfer learning," is based on a robust feature engineering workflow and an XGBoost classification model built on these features. Evaluation of CaSTLe against two benchmark feature-selection and classification methods showed that it outperformed the benchmark methods in most cases and consistently yielded satisfactory classification accuracy. CaSTLe has the additional advantage of being parallelizable and well suited to large datasets. We showed that it was possible to classify cell types using transfer learning, even when the databases contained a very small number of genes, and our study thus indicates the potential applicability of this approach for analysis of scRNA-seq datasets.
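
The two steps the abstract names for a transfer-learning framework (select informative features on a previously labeled dataset, then train a classifier and apply it to a new dataset) can be illustrated with a small sketch. This is not CaSTLe's implementation: ranking genes by variance stands in for the paper's feature engineering workflow, a nearest-centroid rule stands in for the XGBoost model so that the example runs with the Python standard library alone, and restricting to genes present in both datasets is an assumption of the sketch. All function names, gene values, and labels are hypothetical.

```python
from statistics import mean, pvariance


def shared_genes(source_genes, target_genes):
    """Genes measured in both datasets, kept in source order (sketch assumption)."""
    target = set(target_genes)
    return [g for g in source_genes if g in target]


def select_features(cells, genes, k):
    """Keep the k genes with the highest variance across the source cells.

    Variance ranking is a stand-in for the paper's feature engineering.
    """
    ranked = sorted(genes, key=lambda g: pvariance([c[g] for c in cells]),
                    reverse=True)
    return ranked[:k]


def fit_centroids(cells, labels, features):
    """Mean expression vector per cell type (stand-in for the XGBoost model)."""
    groups = {}
    for cell, label in zip(cells, labels):
        groups.setdefault(label, []).append(cell)
    return {label: [mean(c[g] for c in members) for g in features]
            for label, members in groups.items()}


def predict(cell, centroids, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    vec = [cell[g] for g in features]

    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))

    return min(centroids, key=dist)


# Hypothetical labeled source dataset (expression values are made up).
source_cells = [
    {"CD3E": 9.0, "CD19": 0.5, "ACTB": 5.0, "XIST": 0.1},
    {"CD3E": 8.0, "CD19": 0.0, "ACTB": 5.2, "XIST": 0.0},
    {"CD3E": 0.2, "CD19": 7.5, "ACTB": 5.1, "XIST": 0.2},
    {"CD3E": 0.0, "CD19": 8.5, "ACTB": 4.9, "XIST": 0.1},
]
source_labels = ["T cell", "T cell", "B cell", "B cell"]

# Hypothetical unlabeled target dataset with a partly different gene set.
target_cells = [
    {"CD3E": 7.0, "CD19": 0.3, "ACTB": 4.0, "GAPDH": 6.0},
    {"CD3E": 0.1, "CD19": 6.0, "ACTB": 4.5, "GAPDH": 6.2},
]

shared = shared_genes(list(source_cells[0]), list(target_cells[0]))
features = select_features(source_cells, shared, k=2)
centroids = fit_centroids(source_cells, source_labels, features)
predictions = [predict(cell, centroids, features) for cell in target_cells]
```

In a fuller version, the nearest-centroid step would be replaced by a gradient-boosted tree classifier trained on the selected features of the source dataset; the surrounding flow (select features on the source, train, predict on the target) would stay the same.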
