The similarity metric

Abstract

A new class of metrics appropriate for measuring effective similarity relations between sequences, with each metric capturing one type of similarity, is studied. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it minorizes every metric in the class (that is, it is universal in that it discovers all effective similarities). We demonstrate that it too is a metric and takes values in [0, 1]; hence it may be called the similarity metric. This is a theoretical foundation for a new general practical tool. We give two distinctive applications in widely divergent areas (the experiments by necessity use just computable approximations to the target notions). First, we computationally compare whole mitochondrial genomes and infer their evolutionary history. This results in a first completely automatically computed whole mitochondrial phylogeny tree. Second, we give a fully automatically computed language tree of 52 different languages based on translated versions of the "Universal Declaration of Human Rights".
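
Because Kolmogorov complexity is noncomputable, any experiment has to substitute a computable approximation, as the abstract notes. A minimal sketch of one common substitute follows: it replaces the complexity of a string with its compressed length and computes what is often called a normalized compression distance. The choice of zlib as the compressor, the function names `compressed_len` and `ncd`, and the sample strings are assumptions made for illustration; the abstract does not say which compressor or exact formula the experiments used.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed input, a rough stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Compression-based distance between x and y (assumed approximation).

    Smaller values mean the compressor finds more shared structure.
    With a real compressor the value can slightly exceed 1.
    """
    cx = compressed_len(x)
    cy = compressed_len(y)
    cxy = compressed_len(x + y)  # cost of describing both sequences together
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical sample data: two related texts and one unrelated random sequence.
    a = b"the quick brown fox jumps over the lazy dog " * 50
    b = b"the quick brown fox leaps over the lazy cat " * 50
    rng = random.Random(0)
    r = bytes(rng.getrandbits(8) for _ in range(len(a)))

    print("related:  ", ncd(a, b))
    print("unrelated:", ncd(a, r))
```

Pairwise distances computed this way could be collected into a distance matrix and passed to a tree-building or clustering method, which is the general shape of the phylogeny and language-tree experiments the abstract describes. The quality of the result depends heavily on the compressor, and zlib's 32 KB window makes it unsuitable for long inputs such as whole genomes.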
