Conference: Joint Conference on Lexical and Computational Semantics
DeepPurple: Estimating Sentence Semantic Similarity using N-gram Regression Models and Web Snippets

Abstract

We estimate the semantic similarity between two sentences using regression models with three features: 1) n-gram hit rates (lexical matches) between sentences, 2) lexical semantic similarity between non-matching words, and 3) sentence length. Lexical semantic similarity is computed via co-occurrence counts on a corpus harvested from the web using a modified mutual information metric. State-of-the-art results are obtained for semantic similarity computation at the word level; however, the fusion of this information at the sentence level provides only moderate improvement on Task 6 of SemEval'12. Despite the simple features used, regression models provide good performance, especially for shorter sentences, reaching a correlation of 0.62 on the SemEval test set.
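As a rough illustration of the feature types named in the abstract, the following Python sketch computes n-gram hit rates and sentence lengths for a sentence pair, plus a plain pointwise mutual information score from co-occurrence counts. It is not the authors' implementation. The hit-rate definition, the whitespace tokenization, the function names, and the use of unmodified PMI in place of the paper's modified metric are all assumptions made for this example. Fitting the regression model on such features is not shown.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset of n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_hit_rate(tokens_a, tokens_b, n):
    """Fraction of n-grams in sentence A that also occur in sentence B.

    Assumed definition: clipped matches divided by A's n-gram count.
    The abstract does not define the hit rate precisely.
    """
    grams_a, grams_b = ngrams(tokens_a, n), ngrams(tokens_b, n)
    total = sum(grams_a.values())
    if total == 0:
        return 0.0
    hits = sum(min(count, grams_b[g]) for g, count in grams_a.items())
    return hits / total


def pmi(count_xy, count_x, count_y, total):
    """Plain pointwise mutual information (base 2) from co-occurrence counts.

    Stand-in only: the paper uses a modified metric whose form is not given here.
    """
    if min(count_xy, count_x, count_y) == 0:
        return float("-inf")
    return math.log2((count_xy * total) / (count_x * count_y))


def features(sent_a, sent_b, max_n=2):
    """Hypothetical feature vector: hit rates for n = 1..max_n, then both lengths."""
    a, b = sent_a.lower().split(), sent_b.lower().split()
    rates = [ngram_hit_rate(a, b, n) for n in range(1, max_n + 1)]
    return rates + [float(len(a)), float(len(b))]
```

For example, `features("the cat sat", "the cat ran")` yields a unigram hit rate of 2/3, a bigram hit rate of 1/2, and two lengths of 3. A vector like this, extended with a word-level similarity score for the non-matching words, could then serve as input to a regression model trained on human similarity ratings.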