Conference: International Conference on Intelligent Computing

EnsembleKQC: An Unsupervised Ensemble Learning Method for Quality Control of Single Cell RNA-seq Sequencing Data

Abstract

Single-cell RNA sequencing (scRNA-seq) provides a high-resolution view of cellular heterogeneity. A series of analyses, such as cell-type identification, differential expression analysis, and regulatory relationship detection, can uncover unprecedented biological findings. Before these downstream analyses, it is crucial to remove low-quality cells, because they are technical noise that weakens the true biological signal and misleads downstream analysis. Existing methods require either manually set thresholds or true labels for supervised training, which is not appropriate in many cases. We present an unsupervised ensemble learning method that automatically identifies low-quality cells in scRNA-seq data. The method integrates weak classifiers based on five selected features derived from housekeeping genes, read mapping rate, and detected genes. To avoid setting classifier thresholds manually, it enumerates threshold values within a reasonable range and chooses the most suitable values according to a scoring function. In experiments, it exhibits high and stable accuracy on multiple datasets.
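The threshold enumeration described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring function, the voting rule, or the threshold range, so everything below is an assumption made for illustration: the names `ensemble_labels`, `separation_score`, and `select_thresholds` are hypothetical, the score is a simple gap between group means, features are assumed to be scaled so that lower values indicate lower quality, and the toy data uses three features rather than the five used by EnsembleKQC.

```python
from itertools import product
from statistics import mean


def ensemble_labels(cells, thresholds, min_votes):
    """Flag a cell as low quality when at least `min_votes` weak classifiers agree.

    Each weak classifier looks at one feature and votes "low quality"
    when the value falls below that feature's threshold (assumed rule).
    """
    labels = []
    for cell in cells:
        votes = sum(value < t for value, t in zip(cell, thresholds))
        labels.append(votes >= min_votes)
    return labels


def separation_score(cells, labels):
    """Hypothetical score: summed gap between per-feature group means."""
    low = [c for c, flagged in zip(cells, labels) if flagged]
    kept = [c for c, flagged in zip(cells, labels) if not flagged]
    if not low or not kept:
        return float("-inf")  # a split with an empty group is rejected
    n_features = len(cells[0])
    return sum(
        mean(c[j] for c in kept) - mean(c[j] for c in low)
        for j in range(n_features)
    )


def select_thresholds(cells, grid, min_votes):
    """Try every threshold combination from `grid`; keep the best-scoring one."""
    n_features = len(cells[0])
    best = (float("-inf"), None, None)
    for thresholds in product(grid, repeat=n_features):
        labels = ensemble_labels(cells, thresholds, min_votes)
        score = separation_score(cells, labels)
        if score > best[0]:
            best = (score, thresholds, labels)
    return best


# Toy data (invented for illustration): four good cells, two low-quality cells.
cells = [
    (0.9, 0.8, 0.9), (0.8, 0.9, 0.8), (0.9, 0.9, 0.9), (0.8, 0.8, 0.8),
    (0.2, 0.1, 0.2), (0.1, 0.2, 0.1),
]
score, thresholds, labels = select_thresholds(
    cells, grid=(0.05, 0.3, 0.95), min_votes=2
)
print(labels)
```

On this toy input the last two cells are flagged and the first four are kept. Exhaustive enumeration grows exponentially with the number of features, so a real implementation would need a small candidate grid per feature or a smarter search; how EnsembleKQC handles this is not stated in the abstract.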
