BMC Genomics

A comparison across non-model animals suggests an optimal sequencing depth for de novo transcriptome assembly


Abstract

Background: The lack of genomic resources can present challenges for studies of non-model organisms. Transcriptome sequencing offers an attractive method to gather information about genes and gene expression without the need for a reference genome. However, it is unclear what sequencing depth is adequate to assemble the transcriptome de novo for these purposes.

Results: We assembled transcriptomes of animals from six different phyla (Annelids, Arthropods, Chordates, Cnidarians, Ctenophores, and Molluscs) at regular increments of reads using Velvet/Oases and Trinity to determine how read count affects the assembly. This included an assembly of mouse heart reads, which could be compared against the available reference genome. We found qualitative differences between assemblies of whole animals and assemblies of single tissues. With increasing reads, whole-animal assemblies show a rapid increase in transcripts and in the discovery of conserved genes, while single-tissue assemblies show a slower discovery of conserved genes, though the assembled transcripts were often longer. A deeper examination of the mouse assemblies shows that assembly errors become more frequent with more reads, but such errors can be mitigated with more stringent assembly parameters.

Conclusions: These assembly trends suggest that representative assemblies are generated with as few as 20 million reads for tissue samples and 30 million reads for whole animals for RNA-level coverage. These depths provide a good balance between coverage and noise. Beyond 60 million reads, the discovery of new genes is low and sequencing errors of highly expressed genes are likely to accumulate. Finally, siphonophores (polymorphic Cnidarians) are an exception and possibly require alternate assembly strategies.
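The "regular increments of reads" design in the abstract can be illustrated with a small sketch. The abstract does not say how the read subsets were drawn, so the nested, shuffled sampling below is an assumption, and `parse_fastq` and `nested_subsets` are hypothetical helper names rather than part of Velvet/Oases or Trinity. Holding all records in memory is only for illustration; real datasets with tens of millions of reads would call for streaming.

```python
import random
from typing import Dict, List, Sequence, Tuple

# One FASTQ record: header, sequence, separator, quality string.
FastqRecord = Tuple[str, ...]


def parse_fastq(lines: Sequence[str]) -> List[FastqRecord]:
    """Group FASTQ text lines into 4-line records."""
    stripped = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(stripped) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    return [tuple(stripped[i:i + 4]) for i in range(0, len(stripped), 4)]


def nested_subsets(
    records: Sequence[FastqRecord], step: int, seed: int = 0
) -> Dict[int, List[FastqRecord]]:
    """Return {read_count: reads} for step, 2*step, ... up to len(records).

    Assumption: reads are shuffled once, so every larger subset contains
    all smaller ones and assemblies differ only by the reads added.
    """
    if step <= 0:
        raise ValueError("step must be a positive number of reads")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    counts = range(step, len(shuffled) + 1, step)
    return {n: shuffled[:n] for n in counts}
```

Each subset would then be written back to FASTQ and passed to the assembler separately, so that transcript counts and conserved-gene discovery can be compared across read depths.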
