TEXT SIMILARITY ACQUISITION METHOD AND APPARATUS, AND ELECTRONIC DEVICE AND COMPUTER-READABLE STORAGE MEDIUM

Abstract

A text similarity acquisition method and apparatus, an electronic device, and a computer-readable storage medium, relating to the field of machine learning. The method comprises: splicing two texts whose similarity is to be compared into a spliced text, in which the two texts form a first text segment and a second text segment, respectively (210); performing character segmentation and vectorization on the spliced text to obtain a character vector for each character in the spliced text (220); for each character in the spliced text, calculating a feature vector of that character using the character vectors, wherein the feature vector of a character represents the similarity features of that character with respect to the spliced text (230); and calculating the similarity between the first text segment and the second text segment using the feature vectors of the characters in the first text segment and the feature vectors of the characters in the second text segment, thereby obtaining a similarity value for the two segments (240). The method can improve the accuracy of the obtained text similarity.
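The four steps (210 to 240) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how character vectors, feature vectors, or the final similarity are computed. Here a hash-based embedding stands in for a learned character embedding, a single untrained self-attention pass stands in for the feature extractor, and mean pooling with cosine similarity stands in for the similarity calculation. All function names are hypothetical.

```python
import hashlib
import math


def char_vector(ch, dim=16):
    """Step 220 stand-in: deterministic pseudo-embedding (assumption, not a learned model)."""
    digest = hashlib.sha256(ch.encode("utf-8")).digest()
    # Map the first `dim` bytes of the hash to floats in [-1, 1].
    return [(b - 127.5) / 127.5 for b in digest[:dim]]


def splice(text_a, text_b):
    """Step 210: join the two texts and record which segment each character belongs to."""
    chars = list(text_a) + list(text_b)  # character segmentation
    segment_ids = [0] * len(text_a) + [1] * len(text_b)
    return chars, segment_ids


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def feature_vectors(vectors):
    """Step 230 stand-in: one untrained self-attention pass over the whole spliced text,
    so each character's feature vector depends on every character in both segments."""
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    features = []
    for query in vectors:
        scores = [dot(query, key) / scale for key in vectors]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        features.append([
            sum(w * vec[i] for w, vec in zip(weights, vectors)) / total
            for i in range(dim)
        ])
    return features


def mean_pool(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def text_similarity(text_a, text_b):
    """Step 240 stand-in: cosine similarity of the mean-pooled segment features."""
    if not text_a or not text_b:
        raise ValueError("both texts must be non-empty")
    chars, segment_ids = splice(text_a, text_b)
    vectors = [char_vector(ch) for ch in chars]
    features = feature_vectors(vectors)
    first = [f for f, s in zip(features, segment_ids) if s == 0]
    second = [f for f, s in zip(features, segment_ids) if s == 1]
    return cosine(mean_pool(first), mean_pool(second))
```

Calling `text_similarity("abc", "abc")` gives a value close to 1.0, because identical segments produce identical pooled features. With untrained stand-ins the scores for differing texts carry no semantic meaning; a real system would replace `char_vector` and `feature_vectors` with a trained model.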