GigaScience

Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore

Abstract

Background: The availability of reference genomes has revolutionized the study of biology. Multiple competing technologies have been developed during the past decade to improve the quality and robustness of genome assemblies. The 2 widely used long-read sequencing providers, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), have recently updated their platforms: PacBio now enables high-throughput HiFi reads with base-level resolution of >99%, and ONT has generated reads as long as 2 Mb. We applied the 2 up-to-date platforms to a single rice individual and then compared the 2 assemblies to investigate the advantages and limitations of each.

Results: ONT ultralong reads delivered higher contiguity, producing a total of 18 contigs, of which 10 were chromosome-level (each assembled into a single chromosome), compared with 394 contigs and 3 chromosome-level contigs for the PacBio assembly. The ONT ultralong reads also prevented assembly errors caused by long repetitive regions: in the PacBio assembly we observed 44 falsely duplicated genes and 10 falsely lost genes, leading to over- or underestimation of the gene families in those regions. Conversely, the PacBio HiFi reads generated an assembly with considerably fewer errors at the level of single nucleotides and small insertions and deletions than the ONT assembly, which averaged 1.06 errors per kb and ultimately produced 1,475 incorrect gene annotations through altered or truncated protein predictions.

Conclusions: Both PacBio HiFi reads and ONT ultralong reads have their own merits. Future reference genome construction could leverage both techniques to lessen the impact of the assembly errors, and the subsequent annotation mistakes, rooted in each.
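To put the reported ONT error rate on a more familiar scale, the sketch below converts errors per kb into a Phred-scaled quality value (QV). This is an illustration only, not a calculation from the paper: the function name `errors_per_kb_to_qv` is hypothetical, and it assumes the standard Phred definition QV = -10 · log10(per-base error probability).

```python
import math


def errors_per_kb_to_qv(errors_per_kb: float) -> float:
    """Convert an error rate in errors per 1,000 bases to a Phred-scaled QV.

    Hypothetical helper for illustration; assumes the standard Phred
    definition QV = -10 * log10(per-base error probability).
    """
    if errors_per_kb <= 0:
        raise ValueError("error rate must be positive")
    per_base_error = errors_per_kb / 1000.0
    return -10.0 * math.log10(per_base_error)


# 1.06 errors per kb is the average reported for the ONT assembly.
ont_assembly_qv = errors_per_kb_to_qv(1.06)
print(f"ONT assembly QV: {ont_assembly_qv:.1f}")
```

Under that assumption, 1.06 errors per kb corresponds to a QV just under 30, that is, roughly one error every 940 bases.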