Sentence Similarity Based on Semantic Vector Model

Abstract

Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, Web page retrieval, and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require human input, and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes into account the semantic information, structural information, and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database, How-net. The use of a lexical database enables our method to model human common sense knowledge. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows higher accuracy than other methods.
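
The abstract does not give the algorithm's formulas, so the following is only a minimal sketch of one common way to combine a semantic vector with word order information for two short sentences. Everything named here is an assumption for illustration: the toy table TOY_WORD_SIM stands in for a lexical database such as How-net, the threshold and the weight delta are arbitrary, and the structural information mentioned in the abstract is not modeled at all.

```python
import math

# Hypothetical toy lexicon standing in for a lexical database such as How-net.
# The scores are invented for illustration only.
TOY_WORD_SIM = {
    frozenset(["dog", "puppy"]): 0.8,
    frozenset(["chases", "follows"]): 0.7,
}


def word_sim(a, b):
    """Word-to-word similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_WORD_SIM.get(frozenset([a, b]), 0.0)


def semantic_vector(words, joint):
    """For each joint-set word, the best similarity to any word in the sentence."""
    return [max(word_sim(j, w) for w in words) for j in joint]


def order_vector(words, joint, threshold=0.4):
    """For each joint-set word, the 1-based position of its best match, or 0."""
    vec = []
    for j in joint:
        best_pos, best_sim = 0, 0.0
        for pos, w in enumerate(words, start=1):
            s = word_sim(j, w)
            if s > best_sim:
                best_pos, best_sim = pos, s
        vec.append(best_pos if best_sim > threshold else 0)
    return vec


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def order_similarity(r1, r2):
    """1 - ||r1 - r2|| / ||r1 + r2||, so identical orderings score 1.0."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 1.0


def sentence_similarity(s1, s2, delta=0.85):
    """Weighted mix of semantic and word order similarity (delta is assumed)."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    joint = list(dict.fromkeys(w1 + w2))  # distinct words, first-seen order
    sem = cosine(semantic_vector(w1, joint), semantic_vector(w2, joint))
    order = order_similarity(order_vector(w1, joint), order_vector(w2, joint))
    return delta * sem + (1 - delta) * order
```

Under these assumptions, sentence_similarity("dog chases cat", "cat chases dog") scores below 1.0 even though both sentences contain the same words, because the word order term penalizes the swapped positions. Each vector has one entry per distinct word in the two sentences, which is how this style of method avoids the very high-dimensional representations used for long documents.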
