2011 23rd IEEE International Conference on Tools with Artificial Intelligence

An Experimental Study on Learning with Good Edit Similarity Functions

Abstract

Similarity functions are essential to many learning algorithms. To be usable in support vector machines (SVM), i.e., for the convergence of the learning algorithm to be guaranteed, they must be valid kernels. In the case of structured data, similarities based on the popular edit distance often do not satisfy this requirement, which explains why they are typically used with k-nearest neighbors (k-NN). A common approach to using such edit similarities in SVM is to transform them into potentially (but not provably) valid kernels. Recently, a different theory of learning with (ε, γ, τ)-good similarity functions was proposed, allowing the use of non-kernel similarity functions. Moreover, the resulting models are supposedly sparse, as opposed to standard SVM models, which can be unnecessarily dense. In this paper, we study the relevance and applicability of this theory in the context of string edit similarities. We show that they are naturally good for a given string classification task and provide experimental evidence that the obtained models not only clearly outperform the k-NN approach, but are also competitive with standard SVM models learned with state-of-the-art edit kernels, while being much sparser.
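To make the abstract's setting concrete, the following is a minimal sketch (not the paper's implementation) of an edit-distance-based similarity and a simple classifier that votes over similarities to a small set of labeled "landmark" strings, in the spirit of the (ε, γ, τ)-goodness framework. All function names, weights, and data below are illustrative assumptions, not taken from the paper.

```python
import math

def edit_distance(a, b):
    """Classic Levenshtein edit distance via dynamic programming."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,                           # deletion
                         cur[j - 1] + 1,                        # insertion
                         prev[j - 1] + (a[i - 1] != b[j - 1]))  # substitution
        prev = cur
    return prev[n]

def edit_similarity(a, b):
    """Map the edit distance into (0, 1]; larger means more similar.
    Such a transform need not be a valid kernel, which is exactly the
    case the goodness framework is designed to handle."""
    return math.exp(-edit_distance(a, b))

def classify(x, landmarks):
    """Sign of a weighted vote over labeled landmark strings (y in {-1, +1}).
    In the goodness framework a linear separator over similarities to a
    small landmark set suffices; here all weights are fixed to 1 for brevity,
    whereas a learned (and typically sparse) weighting would be used in practice."""
    score = sum(y * edit_similarity(x, s) for s, y in landmarks)
    return 1 if score >= 0 else -1
```

For example, `classify("aab", [("aaa", 1), ("bbb", -1)])` returns `1`, since `"aab"` is one edit away from the positive landmark but two from the negative one. Sparsity in the learned models corresponds to most landmark weights being driven to zero.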
