Journal of Chinese Information Processing (中文信息学报)

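A Macro Discourse-Level Primary and Secondary Relation Recognition Method Based on Topic Similarity (基于主题相似度的宏观篇章主次关系识别方法)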

     

Abstract

Discourse analysis is an important task in natural language processing. Analyzing primary and secondary relations at the discourse level helps in understanding discourse structure and semantics, and provides strong support for natural language processing applications. Building on research into micro discourse-level primary and secondary relation recognition, this paper focuses on macro discourse-level primary and secondary relations and proposes a recognition model based on topic similarity computed with word2vec and LDA. The word2vec-based topic similarity and the LDA-based topic similarity measure semantic similarity along different dimensions. The two are complementary at the semantic level, which strengthens the model's ability to recognize macro discourse-level primary and secondary relations. Experimental results on the Macro Chinese Discourse TreeBank (MCDTB) show that the model achieves an F1-score of 79.9% and an accuracy of 81.82%, improving on the baseline by 1.7% and 1.81%, respectively.
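The abstract does not say how the two similarities are computed, so the following is only a minimal sketch of one common approach, not the authors' implementation. It assumes that a discourse unit is compared against a set of topic words by the cosine of averaged word vectors, and against a document-level topic distribution by the cosine of LDA topic proportions. The function names, the toy embeddings, and the toy topic distributions are all hypothetical; in practice the vectors would come from a trained word2vec model and the distributions from a trained LDA model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding; [] if none do."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def topic_similarity_features(unit_tokens, topic_tokens, embeddings,
                              unit_topics, doc_topics):
    """Return (embedding-based similarity, topic-distribution similarity).

    Assumption: both values are cosine similarities and are later fed to a
    classifier as two complementary features.
    """
    emb_sim = cosine(mean_vector(unit_tokens, embeddings),
                     mean_vector(topic_tokens, embeddings))
    lda_sim = cosine(unit_topics, doc_topics)
    return emb_sim, lda_sim


# Toy stand-ins for trained models (hypothetical values).
toy_embeddings = {
    "economy": [1.0, 0.0],
    "market": [0.8, 0.2],
    "weather": [0.0, 1.0],
}
emb_sim, lda_sim = topic_similarity_features(
    unit_tokens=["economy", "market"],
    topic_tokens=["economy"],
    embeddings=toy_embeddings,
    unit_topics=[0.7, 0.2, 0.1],   # toy LDA topic proportions of the unit
    doc_topics=[0.6, 0.3, 0.1],    # toy LDA topic proportions of the document
)
print(emb_sim, lda_sim)
```

Keeping the two scores as separate features, rather than averaging them, matches the abstract's point that the two measures capture similarity along different dimensions and complement each other.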
