Journal: Complexity

Design and Implementation of English Intelligent Communication Platform Based on Similarity Algorithm



Abstract

Intelligent communication processing in English aims to extract useful information from unstructured text data using text processing techniques. Text vector representation and text similarity calculation are fundamental tasks in natural language processing. To address the shortcomings of existing sentence vector representation models and the single-method nature of existing text similarity algorithms, this paper proposes improved models and algorithms based on a study of the related techniques. Existing text vectorization models and text similarity algorithms are described and their shortcomings summarized, which motivates the work and suggests directions for improvement. Experiments show that the proposed sentence vector model achieves higher accuracy than the SIF sentence vector model on text classification tasks. On the text similarity task, it achieves better results on three evaluation metrics: accuracy, recall, and F1 score. The text similarity algorithm works in two stages. First, it addresses the deficiencies of the traditional word mover's distance (WMD) algorithm by defining multifeature fusion weights, giving a similarity calculation based on multifeature weighted fusion with better results. Then, a linear weighting model combines this result with the similarity computed from the hierarchically pooled IIG-SIF sentence vectors, giving a multimodel fusion text similarity algorithm. The algorithm also improves computational efficiency to some extent by removing feature words with low feature contribution.
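
The abstract names two building blocks without giving formulas: SIF-style sentence vectors and a linear weighting model that fuses two similarity scores. The sketch below illustrates both under stated assumptions and is not the paper's implementation. It uses plain SIF weighting a / (a + p(w)) without the common-component removal step, standing in for the IIG-SIF model whose details the abstract does not give. The toy vectors, the word probabilities, the fusion weight `lam`, and all function names are hypothetical.

```python
import math


def sif_sentence_vector(tokens, word_vectors, word_prob, a=1e-3):
    """Average word vectors with SIF weights a / (a + p(w)).

    Assumption: the common-component removal step of SIF is omitted
    to keep the sketch short.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        if tok not in word_vectors:
            continue  # skip out-of-vocabulary tokens
        weight = a / (a + word_prob.get(tok, 0.0))
        for i, x in enumerate(word_vectors[tok]):
            total[i] += weight * x
        count += 1
    if count == 0:
        return total  # zero vector when no token is known
    return [x / count for x in total]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(sim_a, sim_b, lam=0.5):
    """Linear weighting of two similarity scores, with lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * sim_a + (1.0 - lam) * sim_b


# Hypothetical toy data: 2-d word vectors and unigram probabilities.
TOY_VECTORS = {
    "the": [0.5, 0.5],
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
}
TOY_PROB = {"the": 0.5, "cat": 0.01, "dog": 0.01, "car": 0.01}

v_cat = sif_sentence_vector(["the", "cat"], TOY_VECTORS, TOY_PROB)
v_dog = sif_sentence_vector(["the", "dog"], TOY_VECTORS, TOY_PROB)
v_car = sif_sentence_vector(["the", "car"], TOY_VECTORS, TOY_PROB)

sif_sim = cosine(v_cat, v_dog)
# Placeholder for the WMD-based score, which this sketch does not compute.
other_sim = 0.8
fused = fuse(sif_sim, other_sim, lam=0.5)
```

Because the frequent word "the" receives a much smaller weight than the rarer content words, the sentence vectors are dominated by "cat", "dog", and "car", so "the cat" lands closer to "the dog" than to "the car". In the fusion step, `lam` controls how much the sentence-vector similarity counts relative to the second score. The abstract does not state how the paper sets that weight.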
