Journal: Nature Biotechnology

Detecting and correcting systematic variation in large-scale RNA sequencing data

Abstract

High-throughput RNA sequencing (RNA-seq) enables comprehensive scans of entire transcriptomes, but best practices for analyzing RNA-seq data have not been fully defined, particularly for data collected with multiple sequencing platforms or at multiple sites. Here we used standardized RNA samples with built-in controls to examine sources of error in large-scale RNA-seq studies and their impact on the detection of differentially expressed genes (DEGs). Analysis of variations in guanine-cytosine content, gene coverage, sequencing error rate and insert size allowed identification of decreased reproducibility across sites. Moreover, commonly used methods for normalization (cqn, EDASeq, RUV2, sva, PEER) varied in their ability to remove these systematic biases, depending on sample complexity and initial data quality. Normalization methods that combine data from genes across sites are strongly recommended to identify and remove site-specific effects and can substantially improve RNA-seq studies.
