A semantic textual similarity measurement model based on the syntactic-semantic representation

Tang Zhuo; Xiao Qi; Zhu Li; Li Kenli; Li Keqin


Abstract

Measuring semantic textual similarity (STS) lies at the core of many applications in natural language processing (NLP). Most recent models consider either semantic information or syntactic information, but seldom combine the two in a unified model that makes full use of both. Drawing on the knowledge captured in trained word vectors, this paper proposes a semantic-embedded dependency tree (SEDT) model based on word2vec and GloVe, which can be treated as a syntactic-semantic representation. Because the words in a sentence contribute differently to its meaning, the paper extends SEDT to an enhanced semantic-embedded dependency tree (E-SEDT). A modified partial tree kernel (MPTK) is proposed to automatically extract the syntactic-semantic patterns in this tree. Because the model takes into account syntactic information, semantic knowledge, and the contribution distribution of the word attention model, it can capture sentence semantics more comprehensively and thereby improve the accuracy of STS results. Finally, SEDT/E-SEDT is applied to the SemEval semantic textual similarity tasks, and its performance is evaluated with two widely used measures: the Pearson correlation coefficient and the Spearman correlation coefficient. The experimental results show that SEDT/E-SEDT can effectively improve the accuracy of sentence similarity judgments. Compared with other methods for computing semantic similarity, including some neural network models, SEDT/E-SEDT obtains better performance on most datasets.
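
The two evaluation measures named in the abstract can be illustrated with a short sketch. This is not the authors' code: the function names (`pearson`, `average_ranks`, `spearman`) and the five score pairs are made up for illustration, and the only assumption carried over from the task setting is that SemEval STS gold scores are graded on a 0 to 5 scale. Spearman correlation is computed here as the Pearson correlation of the ranks, with tied values given their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: human gold scores (0-5) and a model's predicted scores.
gold = [0.0, 1.5, 2.0, 3.5, 5.0]
predicted = [0.4, 1.0, 2.6, 3.0, 4.8]

print("Pearson :", round(pearson(gold, predicted), 4))
print("Spearman:", round(spearman(gold, predicted), 4))
```

Pearson rewards predictions that track the gold scores linearly, while Spearman only checks that sentence pairs are ordered the same way, which is presumably why STS work such as this reports both.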


Bibliographic details

Source
Intelligent Data Analysis, 2019, No. 4, pp. 933-950 (18 pages)
Authors
Tang Zhuo; Xiao Qi; Zhu Li; Li Kenli; Li Keqin
Author affiliations

Hunan Univ Coll Comp Sci & Elect Engn Changsha Hunan Peoples R China;

Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language of text: English
Keywords
Semantic textual similarity; sentence structural representation; structural kernel; word embedding; attention mechanism;


Similar literature


1. Distributed representation and one-hot representation fusion with gated network for clinical semantic textual similarity [J]. Ying Xiong, Shuai Chen, Haoming Qin. BMC Medical Informatics and Decision Making, 2020, No. 1
2. Computing semantic similarity based on novel models of semantic representation using Wikipedia [J]. Qu Rong, Fang Yongyi, Bai Wen. Information Processing & Management, 2018, No. 6
3. Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity [J]. International Journal of Cognitive Informatics and Natural Intelligence, 2020, No. 1
4. A Joint Syntactic-Semantic Representation for Recognizing Textual Relatedness [C]. Rui Wang, Yi Zhang, Guenter Neumann. Text Analysis Conference, 2011
5. Computing the Semantic Textual Similarity of Clinical Notes [D]. Dara, Akanksha. 2021
6. Distributed representation and one-hot representation fusion with gated network for clinical semantic textual similarity [O]. Ying Xiong, Shuai Chen, Haoming Qin. 2020
7. Using Character-Level and Entity-Level Representations to Enhance Bidirectional Encoder Representation From Transformers-Based Clinical Semantic Textual Similarity Model: ClinicalSTS Modeling Study (Preprint) [O]. Ying Xiong, Shuai Chen, Qingcai Chen. 2020
