Venue: Conference on Computational Natural Language Learning

Improving Pointwise Mutual Information (PMI) by Incorporating Significant Co-occurrence

Abstract

We design a new co-occurrence-based word association measure by incorporating the concept of significant co-occurrence into the popular word association measure Pointwise Mutual Information (PMI). Through extensive experiments on a large number of publicly available datasets, we show that the newly introduced measure performs better than other co-occurrence-based measures and, despite being resource-light, compares well with the best-known resource-heavy distributional similarity and knowledge-based word association measures. We investigate the source of this performance improvement and find that, of the two types of significant co-occurrence (corpus-level and document-level), the concept of corpus-level significance combined with the use of document counts in place of word counts is responsible for all the performance gains observed. The concept of document-level significance is not helpful for PMI adaptation.
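To make the idea of "document counts in place of word counts" concrete, the following is a minimal Python sketch, not the authors' implementation. `pmi_doc` is standard PMI estimated from document frequencies, log(d(x,y) · D / (d(x) · d(y))), where D is the number of documents. `pmi_doc_significant` is a hypothetical illustration of a corpus-level significance correction: it inflates the expected co-occurrence count by a margin controlled by an assumed parameter `delta` before taking the ratio. The abstract does not give the formula, so the margin term, the use of the rarer word's count, and every function and parameter name here are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def document_counts(docs):
    """Count, for each word and each unordered word pair, the documents containing it."""
    word_df = Counter()
    pair_df = Counter()
    for doc in docs:
        vocab = sorted(set(doc))  # each word counts once per document
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs come out in sorted order
    return word_df, pair_df


def pmi_doc(x, y, word_df, pair_df, n_docs):
    """Standard PMI computed from document counts instead of word counts."""
    d_xy = pair_df[tuple(sorted((x, y)))]
    if d_xy == 0:
        return float("-inf")  # the pair never co-occurs in any document
    return math.log(d_xy * n_docs / (word_df[x] * word_df[y]))


def pmi_doc_significant(x, y, word_df, pair_df, n_docs, delta=0.9):
    """Assumed variant: compare the observed count with expected count plus a margin.

    delta must lie in (0, 1]; delta = 1 gives a zero margin and reduces to pmi_doc.
    The margin form is an illustrative assumption, not taken from the abstract.
    """
    d_xy = pair_df[tuple(sorted((x, y)))]
    if d_xy == 0:
        return float("-inf")
    expected = word_df[x] * word_df[y] / n_docs
    rarer = min(word_df[x], word_df[y])
    margin = math.sqrt(rarer) * math.sqrt(math.log(delta) / -2.0)
    return math.log(d_xy / (expected + margin))


if __name__ == "__main__":
    docs = [
        ["new", "york", "city"],
        ["new", "york"],
        ["new", "car"],
        ["city", "car"],
    ]
    word_df, pair_df = document_counts(docs)
    print(pmi_doc("new", "york", word_df, pair_df, len(docs)))
    print(pmi_doc_significant("new", "york", word_df, pair_df, len(docs)))
```

Because the margin only enlarges the denominator, the corrected score in this sketch is never higher than the plain document-count PMI, and the gap is largest for pairs whose expected count is small, which is where plain PMI tends to overrate chance co-occurrences.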
