Journal: Annals Data Science

Using Maximum Subarrays for Approximate String Matching


Abstract

In this paper, we evaluate maximum subarrays for approximate string matching and alignment. Both the global alignment score and the local sub-alignments are indicators of a good alignment. After showing how maximum subarrays can be used for string matching, we provide several ways of using them: long, short, loose, strict, and top-k. While the long version extends the local sub-alignments, the short method avoids extensions that would not increase the alignment score. The loose method tries to achieve a high global score, whereas the strict method converts the output of the loose alignment by minimizing unnecessary gaps. The top-k method is used to find the top-k sub-alignments. The results are compared with two dynamic programming methods (global and local) that use gap penalties, as well as with one state-of-the-art method. In our experiments, using maximum subarrays generated good overall alignments as well as good local sub-alignments without requiring gap penalties.
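
The abstract does not spell out how a maximum subarray is computed over two strings, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes an ungapped, position-by-position comparison, with hypothetical scores of +1 for a match and -1 for a mismatch, and uses Kadane's algorithm to find the highest-scoring contiguous segment. The function names and scoring values are illustrative assumptions.

```python
def max_subarray(scores):
    """Kadane's algorithm.

    Returns (best_sum, start, end) for the highest-scoring contiguous
    run, with `end` exclusive. An empty run (0, 0, 0) is returned when
    no run has a positive score.
    """
    best_sum, best_start, best_end = 0, 0, 0
    cur_sum, cur_start = 0, 0
    for i, s in enumerate(scores):
        if cur_sum <= 0:
            # A non-positive prefix cannot help; restart the run here.
            cur_sum, cur_start = s, i
        else:
            cur_sum += s
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i + 1
    return best_sum, best_start, best_end


def best_ungapped_segment(a, b, match=1, mismatch=-1):
    """Score two strings position by position (assumed scoring, no gaps)
    and return the best local segment as (score, start, end)."""
    n = min(len(a), len(b))
    scores = [match if a[i] == b[i] else mismatch for i in range(n)]
    return max_subarray(scores)


score, start, end = best_ungapped_segment("xxGATTACAxx", "yyGATCACAzz")
print(score, start, end)
```

In this sketch the single mismatch inside the segment is absorbed because the matches around it outweigh it, which illustrates how a local sub-alignment can be found without an explicit gap penalty. Handling gaps, or the long, short, loose, strict, and top-k variants named in the abstract, would need more than this.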
