BMC Genomics

Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data



Abstract

Background: Rapid advances in next-generation sequencing methods have provided new opportunities for transcriptome sequencing (RNA-Seq). The unprecedented sequencing depth of RNA-Seq makes it a powerful and cost-efficient method for transcriptome studies, and it has been widely used in both model and non-model organisms to identify and quantify RNA. For non-model organisms lacking well-defined genomes, de novo assembly is typically required for downstream RNA-Seq analyses, including SNP discovery and identification of genes differentially expressed between phenotypes. Although RNA-Seq has been successfully used to sequence many non-model organisms, the results of de novo assembly from short reads can still be improved by using recent bioinformatic developments.

Results: In this study, we used 212.6 million paired-end reads, totaling 16.2 Gb, to assemble the hexaploid wheat transcriptome. Two state-of-the-art assemblers were used: Trinity, which uses a single k-mer method, and Trans-ABySS, which uses a multiple k-mer method. The whole de novo assembly process was divided into four steps: pre-assembly, merging different samples, removal of redundancy, and scaffolding. We documented every detail of these steps and how they influenced assembly performance to gain insight into transcriptome assembly from short reads. After optimization, the assembled transcripts were comparable to Sanger-derived ESTs in terms of both continuity and accuracy. We also provided considerable new wheat transcript data to the community.

Conclusions: It is feasible to assemble the hexaploid wheat transcriptome from short reads. Special attention should be paid to handling multiple samples in order to balance the spectrum of expression levels against redundancy. To obtain an accurate overview of RNA profiling, removal of redundancy may be crucial in de novo assembly.
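The abstract names removal of redundancy as one of the four assembly steps but does not say how it was done. As a rough illustration only, the toy sketch below drops contigs that are exactly contained in a longer contig on either strand, and computes N50 as one common continuity measure. It is not the procedure used in the paper and does not reflect how Trinity or Trans-ABySS work internally; the function names and the exact-containment rule are assumptions made for this example, and real pipelines usually rely on similarity-based clustering instead.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (assumes uppercase A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def remove_contained(contigs):
    """Hypothetical helper: keep only contigs that are not exact substrings,
    on either strand, of an already kept (longer or equal-length) contig."""
    kept = []
    # Deduplicate, then visit the longest contigs first so that shorter
    # contigs are checked against everything that could contain them.
    for seq in sorted(set(contigs), key=lambda s: (-len(s), s)):
        rc = reverse_complement(seq)
        if not any(seq in k or rc in k for k in kept):
            kept.append(seq)
    return kept


def n50(contigs):
    """Length L such that contigs of length >= L cover at least half of all bases."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty input


if __name__ == "__main__":
    # Made-up sequences for illustration; not data from the study.
    toy = ["ACGTACGTAA", "CGTACG", "TTACGT", "GGGCCC", "ACGTACGTAA"]
    reduced = remove_contained(toy)
    print(len(toy), "contigs before,", len(reduced), "after")
    print("N50 after redundancy removal:", n50(reduced))
```

The quadratic substring scan is acceptable for a toy input; for a real transcriptome with many thousands of contigs an indexed or clustering-based approach would be needed.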
