TEXT INFORMATION SIMILARITY MATCHING METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM

Abstract

Provided are a TF-IDF-based text information similarity matching method and apparatus. The method comprises: acquiring text information; performing word segmentation on the text information to obtain segmented words w₁, w₂, ..., wₙ₋₁ and wₙ; using a CBOW model to calculate the word vectors V(w₁), V(w₂), ..., V(wₙ₋₁) and V(wₙ) of the segmented words; using a TF-IDF algorithm to calculate the TF-IDF values k₁, k₂, ..., kₙ₋₁ and kₙ of the segmented words; obtaining a sentence vector V from the products of the word vectors of the segmented words and the corresponding TF-IDF values; and calculating the cosine similarity between the sentence vector V and the sentence vectors of pre-stored statements, and determining the pre-stored statement having the maximum cosine similarity. This process finds the pre-stored statement most similar to the text information and can improve the accuracy of problem recognition in applications such as robot conversation and information classification, thus improving conversation efficiency or classification efficiency. Further provided are a computer device and a storage medium.
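
The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the word vectors are hand-made toy values standing in for CBOW output, the input is assumed to be already segmented into words, the IDF uses an assumed smoothed form, and the sentence vector is assumed to be the sum of kᵢ · V(wᵢ) over distinct words (the abstract says only that V is obtained from these products). The names `tf_idf`, `sentence_vector`, `cosine` and `best_match` are hypothetical.

```python
import math
from collections import Counter

# Toy word vectors standing in for CBOW output (assumption, not trained values).
WORD_VECTORS = {
    "reset": [1.0, 0.0, 0.0],
    "password": [0.9, 0.1, 0.0],
    "check": [0.0, 1.0, 0.0],
    "balance": [0.0, 0.9, 0.1],
    "close": [0.0, 0.0, 1.0],
    "account": [0.1, 0.1, 0.8],
}

# Pre-stored statements, already segmented into words.
STORED = [["reset", "password"], ["check", "balance"], ["close", "account"]]


def tf_idf(tokens, corpus):
    """Return {word: tf * idf} for one segmented sentence.

    Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1, which stays finite
    for words that appear in no pre-stored statement.
    """
    counts = Counter(tokens)
    n_docs = len(corpus)
    weights = {}
    for word, count in counts.items():
        tf = count / len(tokens)
        df = sum(1 for doc in corpus if word in doc)
        weights[word] = tf * (math.log((1 + n_docs) / (1 + df)) + 1.0)
    return weights


def sentence_vector(tokens, word_vectors, corpus):
    """Sum k_i * V(w_i) over distinct words; words without a vector are skipped."""
    dim = len(next(iter(word_vectors.values())))
    vec = [0.0] * dim
    for word, k in tf_idf(tokens, corpus).items():
        if word in word_vectors:
            for i, component in enumerate(word_vectors[word]):
                vec[i] += k * component
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_tokens, stored, word_vectors):
    """Return (index, similarity) of the most similar pre-stored statement."""
    query_vec = sentence_vector(query_tokens, word_vectors, stored)
    scores = [
        cosine(query_vec, sentence_vector(doc, word_vectors, stored))
        for doc in stored
    ]
    best = max(range(len(stored)), key=lambda i: scores[i])
    return best, scores[best]


if __name__ == "__main__":
    index, score = best_match(["reset", "my", "password"], STORED, WORD_VECTORS)
    print(STORED[index], round(score, 3))
```

In practice the toy dictionary would be replaced by vectors from a trained CBOW model, and the sentence vectors of the pre-stored statements would be computed once and cached rather than rebuilt for every query.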

Bibliographic data

Publication number: WO2019196314A1

Publication date: 2019-10-17

Original format: PDF
Applicant/Patentee: PING AN TECHNOLOGY (SHENZHEN) CO. LTD.

Application number: WO2018CN102855
Inventors: ZHOU TAOTAO; ZHOU BAO; WANG JIANZONG; XIAO JING

Filing date: 2018-08-29
Classification: G06F17/30
Country: WO
Date added to database: 2022-08-21 11:52:53
