Machine Learning

Good edit similarity learning by loss minimization



Abstract

Similarity functions are a fundamental component of many learning algorithms. When dealing with string or tree-structured data, measures based on the edit distance are widely used, and there exist a few methods for learning them from data. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities. In this paper, we propose an approach to edit similarity learning based on loss minimization, called GESL. It is driven by the notion of (ε, γ, τ)-goodness, a theory that bridges the gap between the properties of a similarity function and its performance in classification. Using the notion of uniform stability, we derive generalization guarantees that hold for a large class of loss functions. We also provide experimental results on two real-world datasets which show that edit similarities learned with GESL induce more accurate and sparser classifiers than other (standard or learned) edit similarities.
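
To make the loss-minimization setup concrete, below is a minimal, illustrative Python sketch of this style of edit similarity learning. It is not the authors' reference implementation: it fixes the edit script to one optimal unit-cost Levenshtein alignment, treats the approximate edit cost e_C(x, y) as linear in a nonnegative cost matrix C, and learns C by projected subgradient descent on hinge losses with margins b_sim and b_dis. All function names, margins, and hyperparameter values here are assumptions chosen for illustration, not values taken from the paper.

import numpy as np

def levenshtein_op_counts(x, y, sym_index, eps_index):
    # Fill the standard unit-cost Levenshtein DP table.
    n, m = len(x), len(y)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1)
    D[0, :] = np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if x[i - 1] == y[j - 1] else 1
            D[i, j] = min(D[i - 1, j - 1] + sub, D[i - 1, j] + 1, D[i, j - 1] + 1)
    # Backtrace one optimal alignment, counting each operation:
    # substitution a->b (matches included), deletion a->eps, insertion eps->b.
    K = len(sym_index) + 1
    counts = np.zeros((K, K))
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and D[i, j] == D[i - 1, j - 1] + (0 if x[i - 1] == y[j - 1] else 1):
            counts[sym_index[x[i - 1]], sym_index[y[j - 1]]] += 1
            i, j = i - 1, j - 1
        elif i > 0 and D[i, j] == D[i - 1, j] + 1:
            counts[sym_index[x[i - 1]], eps_index] += 1   # deletion
            i -= 1
        else:
            counts[eps_index, sym_index[y[j - 1]]] += 1   # insertion
            j -= 1
    return counts

def learn_edit_costs(pairs, labels, alphabet, b_sim=1.0, b_dis=2.0,
                     lam=0.1, lr=0.01, epochs=500):
    # e_C(x, y) = <C, counts(x, y)> is linear in C, so each hinge term
    # below is convex in C and the overall objective is convex.
    sym_index = {a: k for k, a in enumerate(alphabet)}
    eps_index = len(alphabet)
    K = len(alphabet) + 1
    feats = [levenshtein_op_counts(x, y, sym_index, eps_index) for x, y in pairs]
    C = np.ones((K, K))
    for _ in range(epochs):
        grad = lam * C                        # Frobenius regularization term
        for F, ell in zip(feats, labels):
            e = float(np.sum(C * F))
            if ell == 1 and e > b_sim:        # similar pair scored too costly
                grad += F
            elif ell == -1 and e < b_dis:     # dissimilar pair scored too cheap
                grad -= F
        C = np.maximum(C - lr * grad, 0.0)    # projected (sub)gradient step
    return C

if __name__ == "__main__":
    pairs = [("abca", "abcb"), ("abca", "bbbb")]
    labels = [1, -1]    # +1: similar pair, -1: dissimilar pair
    C = learn_edit_costs(pairs, labels, alphabet="abc")
    print(np.round(C, 2))

A learned C can then be turned into a bounded similarity, for instance K_C(x, y) = 2·exp(−e_C(x, y)) − 1, which lies in (−1, 1], and used to build sparse linear classifiers over similarities to a set of landmark examples, in the spirit of the (ε, γ, τ)-goodness framework the abstract refers to.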

