Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum

Abstract

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly(1). The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE)(2). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a 'near-complete' draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
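
The abstract reports contiguity as an N50 length. As a minimal sketch of how that statistic is conventionally computed (this is the standard definition, not code from the paper), N50 is the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths below are hypothetical and chosen only for illustration; they are not data from the Oropetium assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the summed length of all contigs.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines N50.
        if running >= half_total:
            return length


# Toy lengths for illustration only (not from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Read this way, an N50 of 2.4 megabases means that at least half of the assembled bases lie in contigs of 2.4 megabases or longer.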

Bibliographic details

  • Source
    Nature | 2015, issue 7579 | pp. 508-511 | 4 pages
  • Author affiliations

    Donald Danforth Plant Sci Ctr, St Louis, MO 63132 USA;

    Donald Danforth Plant Sci Ctr, St Louis, MO 63132 USA;

    Michigan State Univ, Dept Hort, E Lansing, MI 48323 USA|Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA;

    Fujian Agr & Forestry Univ, HIST, Ctr Genom & Biotechnol, Fuzhou 350002, Peoples R China|Univ Arizona, Sch Plant Sci, IPlant Collaborat, Tucson, AZ 85721 USA;

    Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA;

    Univ Bonn, IMBIO, D-53115 Bonn, Germany;

    Pacific Biosci, Menlo Pk, CA 94025 USA;

    Pacific Biosci, Menlo Pk, CA 94025 USA;

    Pacific Biosci, Menlo Pk, CA 94025 USA;

    Univ Arizona, Sch Plant Sci, IPlant Collaborat, Tucson, AZ 85721 USA;

    Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA;

    Univ Bonn, IMBIO, D-53115 Bonn, Germany;

    BioNano Genom, San Diego, CA 92121 USA;

    BioNano Genom, San Diego, CA 92121 USA;

    Ibis Biosci, Carlsbad, CA 92008 USA;

    Donald Danforth Plant Sci Ctr, St Louis, MO 63132 USA;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA; MEDLINE, USA; Chemical Abstracts (CA), USA
  • Original format: PDF
  • Language of text: English

  • Added to database: 2022-08-18 02:52:44
