BMC Genomics

Short read Illumina data for the de novo assembly of a non-model snail species transcriptome (Radix balthica, Basommatophora, Pulmonata), and a comparison of assembler performance



Abstract

Background: Until recently, read lengths on the Solexa/Illumina system were too short to reliably assemble transcriptomes without a reference sequence, especially for non-model organisms. However, with read lengths of up to 100 nucleotides available in the current version, an assembly without a reference genome should be possible. For this study we created an EST data set for the common pond snail Radix balthica by Illumina sequencing of a normalized transcriptome. The performance of three different short read assemblers was compared with respect to the number of contigs, their length, depth of coverage, their quality in various BLAST searches, and the alignment to mitochondrial genes.

Results: A single sequencing run of a normalized RNA pool resulted in 16,923,850 paired-end reads with a median read length of 61 bases. The assemblies generated by VELVET, OASES, and SeqMan NGEN differed in the total number of contigs, contig length, the number and quality of gene hits obtained by BLAST searches against various databases, and contig performance in the mt genome comparison. While VELVET produced the highest overall number of contigs, a large fraction of these were of small size (

Conclusion: Our results document the first de novo transcriptome assembly of a non-model species using Illumina sequencing data. We show that de novo transcriptome assembly using this approach yields results useful for downstream applications, in particular if a meta-assembly of contig sets is used to increase contig quality. These results highlight the ongoing need for improvements in assembly methodology.
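The contig-level part of the comparison described above (number of contigs and their lengths) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the N50 statistic, and the `short_cutoff` threshold are assumptions introduced here for illustration only; the abstract does not state which length statistics or size cutoff were used.

```python
from statistics import median


def read_fasta_lengths(lines):
    """Return the length of every sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(lengths, short_cutoff=100):
    """Summarize one contig set; short_cutoff is a hypothetical threshold."""
    n = len(lengths)
    short = sum(1 for x in lengths if x < short_cutoff)
    return {
        "contigs": n,
        "total_bases": sum(lengths),
        "median_length": median(lengths) if lengths else 0,
        "n50": n50(lengths),
        "fraction_short": short / n if n else 0.0,
    }


if __name__ == "__main__":
    # Toy records standing in for one assembler's contig file (not real data).
    toy = [">c1", "ACGTACGT", ">c2", "AC", ">c3", "ACGTAC"]
    print(summarize(read_fasta_lengths(toy), short_cutoff=5))
```

Running `summarize` once per assembler output and placing the resulting dictionaries side by side gives a simple table of the kind of contig-count and contig-length differences the abstract reports.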
