IEEE Transactions on Information Theory

Levenshtein Distance, Sequence Comparison and Biological Database Search


Abstract

Levenshtein edit distance has played a central role, both past and present, in sequence alignment in particular and in biological database similarity search in general. We start our review with a history of dynamic programming algorithms for computing Levenshtein distance and sequence alignments. We then describe how those algorithms led to the heuristics employed in BLAST, the most widely used software in bioinformatics, a program that searches DNA and protein databases for evolutionarily relevant similarities. More recently, the advent of modern genomic sequencing and the volume of data it generates have resulted in a return to the problem of local alignment. We conclude with how the mathematical formulation of Levenshtein distance as a metric made possible additional optimizations to similarity search in biological contexts. These modern optimizations are built around the low metric entropy and fractional dimensionality of biological databases, enabling orders-of-magnitude acceleration of biological similarity search.
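
For readers unfamiliar with the dynamic programming approach the abstract refers to, the following is a minimal sketch of the textbook unit-cost recurrence for Levenshtein distance. It is an illustration only, not code from the paper: the function name `levenshtein` and the two-row memory layout are assumptions made here, and practical alignment tools use substitution matrices, gap penalties, and heuristics instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between a and b.

    Illustrative sketch; the name and the two-row layout are assumptions,
    not taken from the paper.
    """
    # prev[j] is the distance between the current prefix of a and b[:j].
    # It starts as the distance from the empty prefix of a: j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Distance from a[:i] to the empty prefix of b: i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca from a
                curr[j - 1] + 1,           # insert cb into a
                prev[j - 1] + (ca != cb),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` evaluates to 3. Because this distance is symmetric, is zero only for identical strings, and satisfies the triangle inequality, it is a metric, which is the property the abstract says later search optimizations rely on.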
