
De Novo Assembly of Complete Chloroplast Genomes from Non-model Species Based on a K-mer Frequency-Based Selection of Chloroplast Reads from Total DNA Sequences



Abstract

Whole genome shotgun (WGS) sequences of plant species often contain an abundance of reads derived from the chloroplast genome. Until now, these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This re-sequencing approach may select against structural differences between the genomes, especially in non-model species for which no close relatives have been sequenced. The alternative is to assemble the chloroplast genome de novo from total genomic DNA sequences. In this study, we used k-mer frequency tables to identify and extract the chloroplast reads from the WGS reads and assembled them using a highly integrated and automated custom pipeline. Our strategy includes steps aimed at optimizing assemblies and filling gaps left by coverage variation in the WGS dataset. To demonstrate the universality of our approach, we successfully assembled de novo three complete chloroplast genomes from plant species with a range of nuclear genome sizes: Solanum lycopersicum (0.9 Gb), Aegilops tauschii (4 Gb) and Paphiopedilum henryanum (25 Gb). We also highlight the need to optimize the choice of k and the amount of data used. This new and cost-effective method for de novo short-read assembly will facilitate the study of complete chloroplast genomes with more accurate analyses and inferences, especially in non-model plant genomes.
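The idea of selecting chloroplast reads by k-mer frequency can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which the abstract does not describe in detail. It assumes that chloroplast reads are much more deeply covered than nuclear reads, so most of their k-mers are frequent in the dataset. The function names, the `min_count` threshold and the `min_fraction` cutoff are all hypothetical choices made for illustration. A real dataset would likely also need an upper frequency bound to exclude highly repetitive nuclear elements, and a dedicated k-mer counter rather than an in-memory dictionary.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Build a k-mer frequency table over all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def select_high_frequency_reads(reads, k, min_count, min_fraction=0.8):
    """Keep reads in which most k-mers are frequent in the dataset.

    min_count and min_fraction are illustrative assumptions: a k-mer is
    treated as frequent if it occurs at least min_count times, and a read
    is kept if at least min_fraction of its k-mers are frequent.
    """
    counts = count_kmers(reads, k)
    selected = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read shorter than k, nothing to score
        frequent = sum(1 for km in read_kmers if counts[km] >= min_count)
        if frequent / len(read_kmers) >= min_fraction:
            selected.append(read)
    return selected
```

In this sketch both k and the threshold strongly affect which reads are kept, which matches the abstract's remark that the choice of k and the amount of data need to be optimized.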
