Journal: Bioinformatics

COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly

Abstract

Motivation: The rapid advance of next-generation sequencing technologies provides an unprecedented opportunity for elucidating genetic mysteries, yet the short read length still hinders de novo genome assembly. New protocols can now generate overlapping pair-end reads. By joining the 3′ ends of each read pair, one can construct longer reads for assembly. However, effectively joining two overlapped pair-end reads remains a challenging task.

Results: In this article, we present an efficient tool called Connecting Overlapped Pair-End (COPE) reads, which connects overlapping pair-end reads using k-mer frequencies. We evaluated the tool on 30× simulated pair-end reads from Arabidopsis thaliana with a 1% base error rate. COPE connected over 99% of reads with 98.8% accuracy, which is, respectively, 10% and 2% higher than the recently published tool FLASH. When COPE is applied to real reads for genome assembly, the resulting contigs have fewer errors and a 14-fold higher N50 than contigs produced from unconnected reads.
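
To make the idea of joining a read pair at its overlapping 3′ ends concrete, here is a minimal Python sketch. It is not COPE's algorithm: COPE resolves candidate overlaps using k-mer frequencies, which this sketch does not attempt. It only shows the basic step of reverse-complementing the second read and searching for the overlap with the lowest mismatch rate. The function names and the `min_overlap` and `max_mismatch_rate` parameters are hypothetical choices made for illustration.

```python
from typing import Optional

# Complement table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def connect_pair(
    read1: str,
    read2: str,
    min_overlap: int = 10,            # hypothetical threshold
    max_mismatch_rate: float = 0.1,   # hypothetical threshold
) -> Optional[str]:
    """Join read1 and the reverse complement of read2 at their overlap.

    Returns the connected read, or None if no overlap passes the thresholds.
    Illustration only: no base qualities and no k-mer frequencies are used.
    """
    mate = reverse_complement(read2)
    best = None  # (mismatch_rate, overlap_length)
    for length in range(min_overlap, min(len(read1), len(mate)) + 1):
        tail, head = read1[-length:], mate[:length]
        mismatches = sum(a != b for a, b in zip(tail, head))
        rate = mismatches / length
        # Keep the lowest mismatch rate; on ties the longer overlap wins
        # because lengths are visited in increasing order.
        if rate <= max_mismatch_rate and (best is None or rate <= best[0]):
            best = (rate, length)
    if best is None:
        return None
    # Inside the overlap, read1's bases are kept (an arbitrary simplification).
    return read1 + mate[best[1]:]


# Toy example: a 40-base fragment sequenced as two 25-base reads.
fragment = "ATGGCGTACCTTAGGCATCGATTCAGGACCTTGAACGTCA"
read1 = fragment[:25]
read2 = reverse_complement(fragment[-25:])
connected = connect_pair(read1, read2)
```

In this toy example the two reads share a 10-base overlap, so `connected` reproduces the original fragment. With real data, repeats and sequencing errors can produce several plausible overlaps, which is the ambiguity the abstract says makes the task challenging.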