Journal: Progress in Artificial Intelligence

GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms



Abstract

Background: Metagenomic sequencing is a well-established tool in the modern biosciences. While it promises unparalleled insights into the genetic content of the biological samples studied, the conclusions drawn are at risk from biases inherent to DNA sequencing methods, including abundance estimates whose accuracy varies with genomic guanine-cytosine (GC) content.

Results: We explored such GC biases across many commonly used platforms in experiments sequencing multiple genomes (with mean GC contents ranging from 28.9% to 62.4%) and metagenomes. GC bias profiles varied among library preparation protocols and sequencing platforms. Our workflows using MiSeq and NextSeq were hindered by major GC biases, with problems becoming increasingly severe outside the 45-65% GC range. This led to falsely low coverage in GC-rich and especially GC-poor sequences: genomic windows with 30% GC content had >10-fold less coverage than windows close to 50% GC content. We also showed that GC content correlates tightly with coverage biases. The PacBio and HiSeq platforms showed GC bias profiles similar to each other, which were distinct from those seen in the MiSeq and NextSeq workflows. The Oxford Nanopore workflow was not afflicted by GC bias.

Conclusions: These findings indicate potential sources of difficulty in genome sequencing, arising from GC biases, that could be pre-emptively addressed with methodological optimizations, provided that the GC biases inherent to the relevant workflow are understood. Furthermore, we recommend a more critical approach to quantitative abundance estimates in metagenomic studies. In the future, metagenomic studies should take steps to account for the effects of GC bias before drawing conclusions, or they should use a demonstrably unbiased workflow.
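The abstract compares coverage between genomic windows of different GC content (for example, 30% versus about 50%). The following is a minimal sketch of how such a per-window comparison could be computed, not the authors' pipeline. The function names, the 1000-base window, and the 5-point GC bins are all assumptions made for illustration, and the per-window coverage values are assumed to come from whatever depth tool is already in use.

```python
from collections import defaultdict


def gc_fraction(seq):
    """GC fraction among unambiguous bases (A, C, G, T); None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else None


def window_gc(seq, window=1000):
    """GC fraction of each full, non-overlapping window (window size is an assumption)."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


def relative_coverage_by_gc(gc_values, coverages, bin_pct=5):
    """Mean coverage per GC bin, divided by the mean coverage over all kept windows.

    Windows are binned on GC rounded to the nearest whole percent; keys are the
    lower edge of each bin in percent. A value of 1.0 means average coverage.
    """
    bins = defaultdict(list)
    for gc, cov in zip(gc_values, coverages):
        if gc is None:  # window with no unambiguous bases
            continue
        pct = round(gc * 100)
        bins[(pct // bin_pct) * bin_pct].append(cov)
    kept = [c for covs in bins.values() for c in covs]
    overall = sum(kept) / len(kept)
    return {edge: (sum(covs) / len(covs)) / overall
            for edge, covs in sorted(bins.items())}


# Toy, made-up numbers purely to show the call shape (not data from the study).
toy_gc = [0.30, 0.30, 0.50, 0.50]
toy_cov = [2, 2, 20, 20]
profile = relative_coverage_by_gc(toy_gc, toy_cov)
fold_difference = profile[50] / profile[30]
```

In this sketch a flat profile (all values near 1.0) would correspond to a workflow without detectable GC bias, while a profile that drops away from the middle bins would resemble the pattern the abstract describes for the MiSeq and NextSeq workflows.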
