Journal: Bioinformatics

slaMEM: efficient retrieval of maximal exact matches using a sampled LCP array

Abstract

Motivation: Maximal exact matches (MEMs) are a powerful tool in multiple sequence alignment and approximate string matching. The most efficient algorithms for collecting them are based on compressed indexes that rely on data structures centered on the longest common prefix (LCP) array. However, the space-efficient representations of these structures use encoding techniques that are computationally expensive. With the deluge of data generated by high-throughput sequencing, new approaches are needed to handle larger genomic sequences.

Results: In this work, we developed a new sampled representation of the LCP array, optimized to work with the backward search method inherently used by the FM-Index. Unlike previous implementations, which sacrifice running time for smaller space, ours is both fast and space-efficient. This implementation is used by the new software slaMEM, developed to retrieve MEMs efficiently. The results show that the new algorithm is competitive with existing state-of-the-art approaches.
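To make the notion of a MEM concrete, the following is a minimal brute-force sketch of the definition only. It is not the slaMEM algorithm and uses no FM-Index or LCP array; the function name `find_mems` and the `min_len` parameter are hypothetical, chosen for illustration. A match is reported when it cannot be extended to the left or to the right without introducing a mismatch.

```python
def find_mems(ref, query, min_len=1):
    """Brute-force illustration of maximal exact matches (MEMs).

    Returns (ref_pos, query_pos, length) tuples. Runs in quadratic time
    or worse; real tools such as slaMEM use compressed indexes instead.
    """
    mems = []
    for i in range(len(ref)):
        for j in range(len(query)):
            if ref[i] != query[j]:
                continue
            # Left-maximal: skip if the match could be extended leftwards.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            # Extend to the right as far as possible (right-maximal).
            k = 0
            while (i + k < len(ref) and j + k < len(query)
                   and ref[i + k] == query[j + k]):
                k += 1
            if k >= min_len:
                mems.append((i, j, k))
    return mems


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the paper.
    print(find_mems("ACGTACGT", "CGTA", min_len=3))
```

On these toy inputs the query matches the reference at position 1 with length 4 and at position 5 with length 3; shorter matches are filtered out by `min_len`.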