Published in: IEEE Transactions on Knowledge and Data Engineering

Improving Word Similarity by Augmenting PMI with Estimates of Word Polysemy

Abstract

Pointwise mutual information (PMI) is a widely used word similarity measure, but it lacks a clear explanation of how it works. We explore how PMI differs from distributional similarity, and we introduce a novel metric, $\mathrm{PMI}_{\max}$, that augments PMI with information about a word's number of senses. The coefficients of $\mathrm{PMI}_{\max}$ are determined empirically by maximizing a utility function based on the performance of automatic thesaurus generation. We show that it outperforms traditional PMI in the application of automatic thesaurus generation and in two word similarity benchmark tasks: human similarity ratings and TOEFL synonym questions. $\mathrm{PMI}_{\max}$ achieves a correlation coefficient comparable to the best knowledge-based approaches on the Miller-Charles similarity rating data set.
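
For readers unfamiliar with the baseline measure, the following is a minimal sketch of standard PMI, $\mathrm{PMI}(x, y) = \log_2 \frac{p(x, y)}{p(x)\,p(y)}$, estimated from co-occurrence counts. It does not implement the polysemy-augmented metric, whose form and coefficients are not given in the abstract. The function name `pmi_table`, the use of whole contexts as the co-occurrence window, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(contexts):
    """Return base-2 PMI for every word pair that co-occurs in some context.

    Probabilities are estimated as the fraction of contexts containing the
    word (or both words). This is a hypothetical helper, not the paper's code.
    """
    n = len(contexts)
    word_counts = Counter()
    pair_counts = Counter()
    for ctx in contexts:
        words = sorted(set(ctx))  # count each word once per context
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))  # keys are alphabetically ordered pairs

    pmi = {}
    for (w1, w2), c in pair_counts.items():
        p_xy = c / n
        p_x = word_counts[w1] / n
        p_y = word_counts[w2] / n
        pmi[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return pmi


# Toy contexts, invented for illustration only.
contexts = [
    ["car", "automobile", "road"],
    ["car", "automobile"],
    ["car", "road"],
    ["fruit", "apple"],
]
scores = pmi_table(contexts)
```

In this toy data the pair ("apple", "fruit") co-occurs only once yet scores 2.0, higher than ("automobile", "car") at roughly 0.415, which illustrates the known tendency of plain PMI to favor low-frequency pairs.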