
Detecting and correcting systematic variation in large-scale RNA sequencing data



Abstract

High-throughput RNA sequencing (RNA-seq) enables comprehensive scans of entire transcriptomes, but best practices for analyzing RNA-seq data have not been fully defined, particularly for data collected with multiple sequencing platforms or at multiple sites. Here we used standardized RNA samples with built-in controls to examine sources of error in large-scale RNA-seq studies and their impact on the detection of differentially expressed genes (DEGs). Analysis of variations in guanine-cytosine content, gene coverage, sequencing error rate and insert size allowed identification of methods that produce more false positives or are less reproducible across sites. Moreover, commonly used methods for normalization (cqn, EDASeq, RUV2, sva, PEER) varied in their ability to remove these systematic biases, depending on sample complexity and initial data quality. Normalization methods that combine data from genes across sites are strongly recommended to identify and remove site-specific effects, and can substantially improve RNA-seq studies.
