Journal: 《复杂系统与复杂性科学》 (Complex Systems and Complexity Science)

A Method for Constructing Topic Text Networks Based on the PL-LDA Model

     

Abstract

Labeled LDA can mine the probability distribution of words under a given topic, but it cannot analyze the association relationships among those topic words. PMI (Pointwise Mutual Information) can measure the correlation between two words, but it loses the connection to the given topic. Motivated by the way PMI counts word-pair co-occurrences within a fixed window, this paper proposes a topic model called PL-LDA (Pointwise Labeled LDA), which can compute the joint probability distribution of word pairs under a given topic. Experiments on an aviation safety report dataset show that the results produced by PL-LDA have good interpretability. Based on PL-LDA, the paper constructs a topic text network that, in addition to reflecting the distribution of topic words, also displays the complex relationships among them.
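The window-based co-occurrence counting that the abstract cites as the inspiration for PL-LDA can be illustrated with a minimal sketch of standard PMI. This is not the PL-LDA model itself, whose details the abstract does not give; the function name `window_pmi`, the sliding-window scheme, and the default window size are assumptions chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def window_pmi(tokens, window=3):
    """Standard PMI for unordered word pairs, estimated from sliding windows.

    Hypothetical helper for illustration only; it is not taken from the paper.
    Probabilities are estimated as the fraction of windows that contain a word
    (or both words of a pair).
    """
    # Every contiguous run of `window` tokens counts as one window.
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    n = len(windows)
    if n == 0:
        return {}

    word_counts = Counter()
    pair_counts = Counter()
    for w in windows:
        uniq = sorted(set(w))  # count each word at most once per window
        word_counts.update(uniq)
        pair_counts.update(combinations(uniq, 2))  # pairs in sorted order

    pmi = {}
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


# Usage on a toy token sequence (invented data, not from the paper's dataset).
scores = window_pmi(["engine", "failure", "during", "engine", "start"], window=2)
```

Note that these scores are computed over the whole token sequence with no reference to any topic label, which is the limitation of PMI that the abstract points out.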
