Journal: Intelligent Data Analysis

A semantic textual similarity measurement model based on the syntactic-semantic representation


Abstract

Measuring semantic textual similarity (STS) lies at the core of many applications in natural language processing (NLP). Most recent models consider either semantic information or syntactic information, but few offer a unified model that makes full use of both. Drawing on knowledge from pretrained word vectors, this paper proposes a semantic-embedded dependency tree (SEDT) model based on word2vec and GloVe, which can be treated as a syntactic-semantic representation. Because the words in a sentence contribute differently to its meaning, the SEDT model is extended to an enhanced semantic-embedded dependency tree (E-SEDT). A modified partial tree kernel (MPTK) is proposed to automatically extract the syntactic-semantic patterns in this tree. Because the model considers syntactic information, semantic knowledge, and the contribution distribution from the word attention model together, it captures sentence semantics more comprehensively and thereby improves the accuracy of STS results. Finally, SEDT/E-SEDT is applied to the SemEval semantic textual similarity tasks, and its performance is evaluated with two widely used measures: the Pearson correlation coefficient and the Spearman correlation coefficient. The experimental results show that SEDT/E-SEDT effectively improves the accuracy of sentence similarity judgments. Compared with other methods for computing semantic similarity, including some neural network models, SEDT/E-SEDT obtains better performance on most datasets.
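The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the sample scores are assumptions made here for illustration. Spearman correlation is computed as the Pearson correlation of average ranks, which is the usual way to handle ties.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical scores for five sentence pairs (not taken from the paper):
    # human similarity ratings on a 0-5 scale and a model's predictions.
    gold = [0.0, 1.5, 2.0, 3.5, 5.0]
    predicted = [0.1, 0.3, 0.5, 0.6, 0.9]
    print("Pearson :", pearson(gold, predicted))
    print("Spearman:", spearman(gold, predicted))
```

Pearson rewards predictions that are linearly related to the human ratings, while Spearman only checks that the two rankings agree, so reporting both shows whether a model gets the ordering right even when its score scale differs from the annotators'.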
