Journal: 《应用科学学报》 (Journal of Applied Sciences)

News Topic Evolution Analysis Based on the DTS-ILDA Model and Association Filtering

Abstract

In topic evolution tracking, topic models that fix both the time-slice size and the number of topics K cannot uncover important temporal turning points. To address this, a dynamic temporal segmentation-infinite latent Dirichlet allocation (DTS-ILDA) model is proposed. To address the erroneous topic associations that readily arise in evolution analysis, an association filtering mechanism is also proposed. First, topics are extracted with the DTS-ILDA model, which fuses an improved dynamic time segmentation algorithm with the infinite latent Dirichlet allocation (ILDA) model. The dynamic time segmentation algorithm traverses the data set in chronological order and uses contingency table analysis of the topic distributions in the preceding and following time slices to measure segmentation quality, thereby finding suitable split points. The ILDA model extracts a different number of topics within each time slice, and the extracted topics then undergo evolution association analysis. Next, the association filtering method removes weak associations. Finally, evolution graphs covering five types of subtopic evolution relationships are built for the remaining associations according to their chronological order. Experiments show that the method effectively finds the time points at which topic content changes significantly, avoids generating meaningless topics, reduces interference from erroneous topic associations, and uncovers accurate deep relationships among topics.
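The two mechanisms named in the abstract, contingency-table scoring of a candidate split point and filtering of weak topic associations, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state which statistic or similarity measure is used, so the Pearson chi-square statistic, cosine similarity, the 0.5 threshold, and every function name below are assumptions chosen for illustration. The sketch also picks the single best split over all candidates, which simplifies the chronological traversal described above.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0:
        return 0.0
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip cells with no expected mass
                stat += (observed - expected) ** 2 / expected
    return stat


def best_split(slice_topic_counts):
    """Score every candidate split of a chronological list of per-slice topic counts.

    Returns (index, score), where the split falls just before slice `index` and
    the score is the chi-square statistic of the 2 x K table formed by the summed
    topic counts before and after the split (assumed scoring rule).
    """
    best_index, best_score = None, -1.0
    for k in range(1, len(slice_topic_counts)):
        before = [sum(c) for c in zip(*slice_topic_counts[:k])]
        after = [sum(c) for c in zip(*slice_topic_counts[k:])]
        score = chi_square([before, after])
        if score > best_score:
            best_index, best_score = k, score
    return best_index, best_score


def cosine(p, q):
    """Cosine similarity between two topic-word weight vectors of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def filter_links(prev_topics, next_topics, threshold=0.5):
    """Keep only topic pairs across adjacent slices whose similarity meets the threshold."""
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            score = cosine(p, q)
            if score >= threshold:  # weak associations are dropped
                links.append((i, j, score))
    return links


# Hypothetical data: four time slices, two topics; the topic mix flips after slice 2.
counts = [[9, 1], [8, 2], [1, 9], [2, 8]]
split_index, split_score = best_split(counts)

# Hypothetical topic-word vectors for two adjacent time slices.
links = filter_links([[0.9, 0.1, 0.0]], [[0.8, 0.2, 0.0], [0.0, 0.1, 0.9]])
```

With the hypothetical counts above, the highest-scoring split falls where the dominant topic changes, and only the strongly similar topic pair survives filtering. In practice the surviving links would then be classified into the five evolution relationship types, which the abstract does not enumerate and which are therefore not sketched here.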
