Learning Focused Hierarchical Topic Models with Semi-Supervision in Microblogs

Abstract

Topic modeling approaches such as Latent Dirichlet Allocation (LDA) and Hierarchical LDA (hLDA) have been used extensively to discover topics in various corpora. Unfortunately, these approaches do not perform well when applied to collections of social media posts. Further, they do not allow users to focus topic discovery around subjectively interesting concepts. We propose the new Semi-Supervised Microblog-hLDA (SS-Micro-hLDA) model to discover topic hierarchies in short, noisy microblog documents in a way that allows users to focus topic discovery around areas of interest. We test SS-Micro-hLDA using a large, public collection of Twitter messages and posts from the Reddit social blogging site and show that our model outperforms hLDA, Constrained-hLDA, Recursive-rCRP and TSSB in terms of Pointwise Mutual Information (PMI) score. Further, we test our model in terms of information entropy of held-out data and show that the new approach produces highly focused topic hierarchies.
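
The abstract names PMI score as the comparison metric but does not say how it is computed. As a rough illustration only, the sketch below averages pairwise PMI over a topic's top words using document-level co-occurrence. The function name `topic_pmi`, the `eps` smoothing term, and the choice of whole-document co-occurrence are assumptions made for this example, not details taken from the paper.

```python
import math
from itertools import combinations


def topic_pmi(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words (hypothetical helper).

    documents: non-empty list of tokenized documents (lists of strings).
    Probabilities are estimated as the fraction of documents containing
    the word(s); eps is an assumed smoothing term that avoids log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        scores.append(math.log((p12 + eps) / (p1 * p2 + eps)))

    # A topic with fewer than two words has no pairs to score.
    return sum(scores) / len(scores) if scores else 0.0


docs = [["cat", "dog"], ["cat", "dog"], ["fish", "boat"], ["fish", "boat"]]
coherent = topic_pmi(["cat", "dog"], docs)     # words that always co-occur
incoherent = topic_pmi(["cat", "fish"], docs)  # words that never co-occur
```

Under this definition, words that tend to appear in the same documents score above zero and words that never co-occur score well below zero, so a higher average suggests a more coherent topic.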

Bibliographic details

Source
Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2015 | 12 pages
Authors
Anton Slutsky; Xiaohua Hu; Yuan An
Original format PDF
Chinese Library Classification TP311.131-53
Date indexed 2022-08-20 20:05:19

Similar literature

1. Mining Event-Oriented Topics in Microblog Stream with Unsupervised Multi-View Hierarchical Embedding [J]. Peng Min, Zhu Jiahui, Wang Hua. ACM Transactions on Knowledge Discovery from Data. 2018, No. 3
2. Unsupervised Concept Hierarchy Learning: A Topic Modeling Guided Approach [J]. V.S. Anoop, S. Asharaf, P. Deepak. Procedia Computer Science. 2016, No. 1
3. A content search method for security topics in microblog based on deep reinforcement learning [J]. Zhou Nan, Du Junping, Yao Xu. World Wide Web. 2020, No. 1
4. Learning Focused Hierarchical Topic Models with Semi-Supervision in Microblogs [C]. Anton Slutsky, Xiaohua Hu, Yuan An. Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2015
5. Modeling customer-focused engineering program alignment by means of group consensus and analytical hierarchy process analysis [D]. Hartmann, David Herbert. 2004
6. Microblog Topic-Words Detection Model for Earthquake Emergency Responses Based on Information Classification Hierarchy [O]. Xiaohui Su, Shurui Ma, Xiaokang Qiu. 2021
7. Unsupervised Terminological Ontology Learning based on Hierarchical Topic Modeling [O]. Zhu, Xiaofeng; Klabjan, Diego; Bless, Patrick. 2017
