International Conference on Semantics, Knowledge and Grids

Semantic and Heuristic Based Approach for Paraphrase Identification

Abstract

In this paper, we propose a semantic-based approach to paraphrase identification. The core idea is to identify paraphrases in sentences that contain both named entities and common words. The approach computes the semantic similarity of named-entity tokens separately from that of the rest of the sentence. Specifically, it integrates word-level semantic similarity derived from WordNet taxonomic relations with named-entity semantic relatedness inferred from crowd-sourced knowledge in Wikipedia. In addition, we improve the WordNet similarity measure by nominalizing verbs, adjectives, and adverbs with the aid of the Categorial Variation database (CatVar). The system is evaluated on two datasets: the Microsoft Research Paraphrase Corpus (MSRPC) and TREC-9 Question Variants. Experimental results on both datasets show that our system outperforms the baselines on the paraphrase identification task.
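
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea it describes: scoring common words and named entities separately, then combining the two scores. Everything concrete here is an assumption made for illustration: the `Sentence` type, the toy lookup tables standing in for CatVar, WordNet, and Wikipedia, the best-match averaging, and the token-count weighting are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple


class Sentence(NamedTuple):
    words: List[str]     # common (non-entity) content words
    entities: List[str]  # named-entity mentions, assumed already extracted


# Toy stand-ins (assumptions), not the real resources.
# Stand-in for CatVar: maps a verb/adjective/adverb to a related noun.
CATVAR_NOUN: Dict[str, str] = {"acquired": "acquisition", "bought": "purchase"}
# Stand-in for a WordNet-based word similarity in [0, 1].
WORDNET_SIM: Dict[FrozenSet[str], float] = {
    frozenset({"acquisition", "purchase"}): 0.9,
}
# Stand-in for a Wikipedia-based named-entity relatedness in [0, 1].
WIKI_REL: Dict[FrozenSet[str], float] = {
    frozenset({"IBM", "International Business Machines"}): 1.0,
}


def table_sim(table: Dict[FrozenSet[str], float]) -> Callable[[str, str], float]:
    """Build a symmetric similarity function from a lookup table."""
    def sim(a: str, b: str) -> float:
        return 1.0 if a == b else table.get(frozenset({a, b}), 0.0)
    return sim


def best_match_avg(src: List[str], dst: List[str],
                   sim: Callable[[str, str], float]) -> float:
    """Average, over tokens in src, of the best similarity to any token in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(sim(s, d) for d in dst) for s in src) / len(src)


def set_similarity(a: List[str], b: List[str],
                   sim: Callable[[str, str], float]) -> float:
    """Symmetric score: mean of the two directional best-match averages."""
    return 0.5 * (best_match_avg(a, b, sim) + best_match_avg(b, a, sim))


def paraphrase_score(s1: Sentence, s2: Sentence) -> float:
    """Combine word and entity similarity, weighted by token counts (assumed)."""
    # Nominalize words where the toy CatVar table has an entry.
    w1 = [CATVAR_NOUN.get(w, w) for w in s1.words]
    w2 = [CATVAR_NOUN.get(w, w) for w in s2.words]
    n_words = len(w1) + len(w2)
    n_ents = len(s1.entities) + len(s2.entities)
    if n_words + n_ents == 0:
        return 0.0
    word_part = set_similarity(w1, w2, table_sim(WORDNET_SIM))
    ent_part = set_similarity(s1.entities, s2.entities, table_sim(WIKI_REL))
    return (n_words * word_part + n_ents * ent_part) / (n_words + n_ents)


a = Sentence(words=["acquired", "company"], entities=["IBM"])
b = Sentence(words=["bought", "company"],
             entities=["International Business Machines"])
c = Sentence(words=["weather", "sunny"], entities=["Paris"])
score_ab = paraphrase_score(a, b)
score_ac = paraphrase_score(a, c)
```

A pair would then be labeled a paraphrase when its score exceeds a threshold tuned on held-out data; the threshold, like the weighting above, is an illustrative choice. In a real system the lookup tables would be replaced by actual WordNet, Wikipedia, and CatVar lookups.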