Journal: PLoS Genetics

De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis


Abstract

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ~4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.

Author Summary

Fungi have immense impacts on ecosystems and affect many aspects of society. They are used as convenient organisms for fundamental research because their typically haploid genetics enable straightforward phenotyping of mutations and because most fungal cells can differentiate the entire organism. Fungi have compact genomes with few repetitive sequences, and their genomes should be much easier to assemble from short sequence reads than genomes of mammals or higher plants. To test this idea, we used Solexa and 454 sequencing to generate ~4 Gb of raw sequence data from the filamentous fungus Sordaria macrospora. De novo assembly yielded 5,097 contigs. This assembly was improved by comparison with reference genomes of three closely related Neurospora species, resulting in placement of ~40 Mb of genome sequence in 152 scaffolds. From comparisons of predicted proteins we conclude that S. macrospora carries a conserved set of genes for signaling and development, which should encourage its further use as a model organism for morphogenesis and meiosis. We demonstrate that de novo assembly of fungal genomes from short reads is cheap and efficient. Species that are not traditionally considered “model organisms” but await genome sequencing for comparative and functional genomics analyses are at last amenable to in-depth genome-wide analyses.
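
For readers unfamiliar with the assembly statistics quoted above, the following is a minimal Python sketch of how an N50 value is conventionally computed and of how fold coverage relates to raw sequence yield. The function names `n50` and `expected_yield_bp` are hypothetical helpers introduced only for illustration, and the contig lengths are made-up toy values, not data from the study. The only figures taken from the text are the 40 Mb genome size and the 85-fold plus 10-fold coverage.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downward until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def expected_yield_bp(genome_size_bp, fold_coverage):
    """Raw bases needed to cover a genome of the given size at the given depth."""
    return genome_size_bp * fold_coverage


# Toy contig lengths (illustrative only, not from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)

# Figures from the abstract: 40 Mb genome, 85-fold Solexa + 10-fold 454 coverage.
raw_bases = expected_yield_bp(40_000_000, 85 + 10)
```

With the toy lengths, the two longest contigs (80 + 70 = 150) already exceed half of the 290-base total, so the N50 is 70. For the abstract's figures, 40 Mb at a combined 95-fold coverage corresponds to 3.8 Gb of raw sequence, which is consistent with the stated ~4 Gb.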
