Conference: Youth Academic Annual Conference of Chinese Association of Automation

A method of optimizing LDA result purity based on semantic similarity

Abstract

The results of traditional LDA (Latent Dirichlet Allocation) are often hard to interpret, because it is difficult to summarize the meaning of a result topic that contains multiple irrelevant words. To solve this problem, a method of optimizing LDA result purity based on semantic similarity in streaming news processing is proposed. In this method, the Category Cluster Density (CCD) of each topic is calculated first, and topics with lower CCD values are then dropped to improve the overall purity of the LDA result. News clustering experiments show that vague news can be removed effectively and that the retained topics are more interpretable than those produced by the traditional method, so the method can significantly and automatically improve LDA result purity.
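
The score-then-filter step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for CCD, so the code below substitutes a stand-in: the mean pairwise cosine similarity of the word vectors of a topic's top words. The function names, the toy vectors, and the threshold of 0.5 are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_density(words, vectors):
    """Stand-in for CCD (assumption): mean pairwise cosine similarity
    of the vectors of a topic's top words. Words without a vector are skipped."""
    known = [vectors[w] for w in words if w in vectors]
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def filter_topics(topics, vectors, threshold):
    """Score every topic and keep those scoring at least `threshold`.
    The threshold is a hypothetical parameter; the abstract only says
    that topics with lower CCD values are dropped."""
    scores = {tid: topic_density(words, vectors) for tid, words in topics.items()}
    kept = {tid: words for tid, words in topics.items() if scores[tid] >= threshold}
    return kept, scores


# Toy 2-D word vectors, invented for illustration only.
TOY_VECTORS = {
    "stock": (1.0, 0.0),
    "market": (0.9, 0.1),
    "shares": (0.95, 0.05),
    "banana": (0.0, 1.0),
    "goal": (-1.0, 0.0),
}

# Top words per topic, as an LDA run might report them (hypothetical).
TOY_TOPICS = {
    "finance": ["stock", "market", "shares"],
    "mixed": ["stock", "banana", "goal"],
}

kept, scores = filter_topics(TOY_TOPICS, TOY_VECTORS, threshold=0.5)
print(sorted(kept))
```

In this toy run the semantically coherent "finance" topic is kept and the "mixed" topic is dropped. In practice the word vectors would come from a trained embedding model, and the cutoff would need to be tuned on real data.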