Source: U.S. National Institutes of Health literature collection; journal: G3: Genes, Genomes, Genetics

De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

Abstract

The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected, moderate-coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size, estimated by k-mer analysis, is 2.7 Gb. We assembled the beaver genome using the new Canu assembler, which is optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short-read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with a contig N50 of 278,680 bp and a scaffold N50 of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly were demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology.
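The abstract reports contiguity as N50 values and completeness as a share of the estimated genome size. As a minimal sketch of what those statistics mean (not the authors' pipeline), the Python below computes an N50 from a list of sequence lengths using the common definition: the length L such that sequences of length L or longer cover at least half of the total assembled bases. The function name `n50` and the toy lengths are hypothetical illustrations; only the 2.7 Gb and 92% figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition (assumed here): sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy contig lengths (hypothetical, not beaver data).
toy_contigs = [2, 3, 4, 5, 6, 10]
toy_n50 = n50(toy_contigs)

# Rounded figures taken from the abstract: 2.7 Gb estimated genome size,
# with scaffolds spanning 92% of it. The product is only approximate
# because both inputs are rounded.
estimated_genome_gb = 2.7
scaffold_fraction = 0.92
approx_scaffold_span_gb = round(estimated_genome_gb * scaffold_fraction, 2)

print(toy_n50, approx_scaffold_span_gb)
```

With real data, the lengths would be read from the assembly FASTA; applying the same function to contig lengths and to scaffold lengths gives the two N50 figures separately.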
