Venue: IEEE International Conference on Information Systems and Computer Aided Education

A Deep Learning Based Method to Measure the Similarity of Long Text


Abstract

Traditional methods for measuring text similarity are not accurate enough on complex text data, especially long text. We find that the main cause is that their feature representation ability is not strong enough. To improve the accuracy of long-text similarity measurement, we propose an algorithm that uses a pre-trained deep learning model to extract features from long text. On the THUCNews benchmark corpus, the accuracy of our method is 5.4% higher than that of the traditional algorithm. We also perform ablation experiments to measure the improvement contributed by fine-tuning.
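The abstract does not specify the model or the similarity function, so the following is only a minimal sketch of the general idea (encode a long text into a feature vector, then compare vectors), not the authors' method. It assumes the long text is split into fixed-size chunks, that chunk vectors are mean-pooled, and that cosine similarity is the comparison measure. The names `split_into_chunks`, `encode_chunk`, `encode_long_text`, and `long_text_similarity` are hypothetical, and `encode_chunk` is a hashed bag-of-words placeholder standing in for a pre-trained deep learning encoder.

```python
import hashlib
import math
import re

DIM = 64  # size of the placeholder feature vector (an arbitrary choice)


def split_into_chunks(text, max_tokens=128):
    """Split a text into consecutive chunks of at most max_tokens tokens.

    The regex tokenizer is a toy; real Chinese text such as THUCNews
    would need a proper word segmenter or subword tokenizer.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


def encode_chunk(tokens):
    """Placeholder encoder (assumption): hashed bag-of-words counts.

    In the setting the abstract describes, a pre-trained model would
    produce the feature vector for each chunk instead.
    """
    vec = [0.0] * DIM
    for tok in tokens:
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec


def encode_long_text(text, max_tokens=128):
    """Encode each chunk and mean-pool the chunk vectors into one vector."""
    chunks = split_into_chunks(text, max_tokens)
    if not chunks:
        return [0.0] * DIM
    vectors = [encode_chunk(c) for c in chunks]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def long_text_similarity(text_a, text_b, max_tokens=128):
    """Similarity score of two long texts under the assumptions above."""
    return cosine_similarity(
        encode_long_text(text_a, max_tokens),
        encode_long_text(text_b, max_tokens),
    )
```

Calling `long_text_similarity(doc_a, doc_b)` returns a score between 0 and 1 for this placeholder encoder, because its vectors are non-negative. Swapping `encode_chunk` for a pre-trained, optionally fine-tuned, encoder is where the feature representation described in the abstract would come in.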
