首页> 外文会议>Automata, Languages and Programming >Permutation Editing and Matching via Embeddings
【24h】

Permutation Editing and Matching via Embeddings

机译:通过嵌入进行排列编辑和匹配

获取原文

摘要

If the genetic maps of two species are modelled as permutations of (homologous) genes, the number of chromosomal rearrangements in the form of deletions, block moves, inversions etc. to transform one such permutation to another can be used as a measure of their evolutionary distance. Motivated by such scenarios, we study problems of computing distances between permutations as well as matching permutations in sequences, and finding most similar permutation from a collection ("nearest neighbor"). We adopt a general approach: embed permutation distances of relevance into well-known vector spaces in an approximately distance-preserving manner, and solve the resulting problems on the well-known spaces. Our results are as follows: 1. We present the first known approximately distance preserving em-beddings of these permutation distances into well-known spaces. 2. Using these embeddings, we obtain several results, including the first known efficient solution for approximately solving nearest neighbor problems with permutations and the first known algorithms for finding permutation distances in the "data stream" model. 3. We consider a novel class of problems called permutation matching problems which are similar to string matching problems, except that the pattern is a permutation (rather than a string) and present linear or near-linear time algorithms for approximately solving permutation matching problems; in contrast, the corresponding string problems take significantly longer.
机译:如果将两个物种的遗传图谱建模为(同源)基因的排列,则可以将缺失,阻断移动,倒置等形式的染色体重排数目转换为一个这样的排列,将其转变为另一种排列,以此来衡量它们的进化。距离。受这种情况的影响,我们研究了计算排列之间的距离以及匹配序列中的排列,并从集合(“最近的邻居”)中找到最相似的排列的问题。我们采用一种通用的方法:以近似的距离保留方式将相关的置换距离嵌入到众所周知的向量空间中,并解决在已知空间上产生的问题。我们的结果如下:1.我们介绍了这些排列距离到已知空间的第一个已知的近似距离保持嵌入。 2.使用这些嵌入,我们获得了一些结果,包括第一个已知的有效解决方案,该解决方案可以近似地解决带有置换的最邻近问题,以及第一个已知的用于在“数据流”模型中找到置换距离的算法。 3.我们考虑一类新颖的问题,称为排列匹配问题,它与字符串匹配问题相似,不同之处在于模式是排列(而不是字符串),并且提出了线性或接近线性的时间算法来近似解决排列匹配问题;相反,相应的字符串问题花费的时间明显更长。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号