A Rank-Based Sequence Aligner with Applications in Phylogenetic Analysis

Abstract

Recent tools for aligning short DNA reads have been designed to optimize the trade-off between correctness and speed. This paper introduces a method for assigning a set of short DNA reads to a reference genome under Local Rank Distance (LRD). The rank-based aligner proposed in this work prioritizes correctness over speed. However, some indexing strategies to speed up the aligner are also investigated. The LRD aligner is made faster by storing k-mer positions in a hash table for each read. Another improvement, which produces an approximate LRD aligner, is to consider only the positions in the reference that are likely to represent a good positional match of the read. The proposed aligner is evaluated and compared to other state-of-the-art alignment tools in several experiments. One set of experiments is conducted to determine the precision and the recall of the proposed aligner in the presence of contaminated reads. In another set of experiments, the proposed aligner is used to find the order, the family, or the species of a new (or unknown) organism, given only a set of short Next-Generation Sequencing DNA reads. The empirical results show that the aligner proposed in this work is highly accurate from a biological point of view. Compared to the other evaluated tools, the LRD aligner has the important advantage of being very accurate even for a very low base coverage. Thus, the LRD aligner can be considered a good alternative to standard alignment tools, especially when the accuracy of the aligner is of high importance. Source code and UNIX binaries of the aligner are freely available for future development and use at . The software is implemented in C++ and Java and is supported on UNIX and MS Windows.
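
The two speed-ups mentioned in the abstract (a per-read hash table of k-mer positions, and restricting the search to reference positions that are likely matches) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `index_read_kmers` and `candidate_offsets`, the `min_votes` parameter, and the offset-voting scheme are all assumptions made for illustration, and the sketch does not compute LRD itself. It only shows how a k-mer table built from a read could shortlist reference offsets before a more expensive distance is evaluated.

```python
from collections import Counter, defaultdict


def index_read_kmers(read, k):
    """Map each k-mer of the read to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(read) - k + 1):
        table[read[i:i + k]].append(i)
    return table


def candidate_offsets(reference, read, k, min_votes=2):
    """Return reference offsets where the read might align, best first.

    Hypothetical filter: every k-mer shared by the read (position i) and
    the reference (position j) casts one vote for the offset j - i.
    Offsets with fewer than min_votes votes are discarded.
    """
    table = index_read_kmers(read, k)
    votes = Counter()
    last_start = len(reference) - len(read)
    for j in range(len(reference) - k + 1):
        # table.get avoids inserting missing k-mers into the defaultdict.
        for i in table.get(reference[j:j + k], ()):
            offset = j - i
            # Keep only offsets where the whole read fits in the reference.
            if 0 <= offset <= last_start:
                votes[offset] += 1
    # Sort by descending vote count, then by ascending offset.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [offset for offset, count in ranked if count >= min_votes]


reference = "TTTTACGTACGGATTTT"
read = "ACGTACGGA"
print(candidate_offsets(reference, read, k=3))
```

In this sketch, raising `min_votes` trades recall for speed: fewer offsets survive the filter, so fewer full distance computations would be needed, at the risk of discarding a true match for a noisy read.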