
Nanopore sequencing and full genome de novo assembly of human cytomegalovirus TB40/E reveals clonal diversity and structural variations



Abstract

Background: Human cytomegalovirus (HCMV) has a double-stranded DNA genome of approximately 235 kbp that is structurally complex and includes extended GC-rich repeated regions. Genomic recombination events are frequent in HCMV cultures but have also been observed in vivo. Thus, the assembly of HCMV whole genomes from technologies producing sequences shorter than 500 bp is technically challenging. Here we improved the reconstruction of HCMV full genomes by means of a hybrid, de novo genome-assembly bioinformatics pipeline applied to data generated by the recently released MinION MkI B sequencer from Oxford Nanopore Technologies.

Results: The MinION run of the HCMV (strain TB40/E) library resulted in ~47,000 reads from a single R9 flowcell and in ~100× average read depth across the virus genome. We developed a novel, self-correcting bioinformatics algorithm to assemble the pooled HCMV genomes in three stages. In the first stage, long contigs (N50 = 21,892) of lower accuracy were reconstructed. In the second stage, short contigs (N50 = 5686) of higher accuracy were assembled. In the final stage, the high-quality contigs served as templates for the correction of the longer contigs, resulting in a high-accuracy, full genome assembly (N50 = 41,056). We were able to reconstruct a single representative haplotype without employing any scaffolding steps. The majority (98.8%) of the genomic features from the reference strain were accurately annotated on this full genome construct. Our method also allowed the detection of multiple alternative sub-genomic fragments and non-canonical structures, suggesting rearrangement events between the unique (UL/US) and the repeated (T/IRL/S) genomic regions.

Conclusions: Third generation high-throughput sequencing technologies can accurately reconstruct full-length HCMV genomes, including their low-complexity and highly repetitive regions. Full-length HCMV genomes could prove crucial in understanding the genetic determinants and viral evolution underpinning drug resistance, virulence and pathogenesis.
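
The abstract compares the three assembly stages by their N50 values. As a reminder of what that statistic measures, here is a minimal Python sketch of the standard N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the toy contig lengths are illustrative assumptions. They are not taken from the paper, and this is not the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # total is 54; 10 + 9 + 8 = 27 reaches half, so N50 = 8
```

A higher N50 indicates that more of the assembly sits in long contigs. N50 says nothing about base-level accuracy, which is why the abstract reports the lower-accuracy long contigs and the higher-accuracy short contigs separately.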
