Mining Event-Oriented Topics in Microblog Stream with Unsupervised Multi-View Hierarchical Embedding

Peng Min; Zhu Jiahui; Wang Hua; Li Xuhui; Zhang Yanchun; Zhang Xiuzhen; Tian Gang

Abstract

This article presents an unsupervised multi-view hierarchical embedding (UMHE) framework to sufficiently reveal the intrinsic topical knowledge in social events. Event-oriented topics are highly related to such events because they provide explicit descriptions of what has happened in the social community. In many real-world cases, however, it is difficult to include all attributes of microblogs; more often, only the textual aspects are available. Traditional topic modelling methods fail to generate event-oriented topics from the textual aspects alone, since they often overlook the inherent relations between topics. Meanwhile, metrics in the original word vocabulary space might not effectively capture semantic distances. Our UMHE framework overcomes this severe information deficiency and poor feature representation. UMHE first develops a multi-view Bayesian rose tree to preliminarily generate prior knowledge about latent topics and their relations. With such prior knowledge, we design an unsupervised translation-based hierarchical embedding method to better represent these latent topics. By applying self-adaptive spectral clustering on the embedding space and the original space concomitantly, we eventually extract event-oriented topics, expressed as word distributions, to describe social events. Our framework is purely data-driven and unsupervised, requiring no external knowledge. Experimental results on the TREC Tweets2011 dataset and a Sina Weibo dataset demonstrate that the UMHE framework not only constructs a hierarchical structure with high fitness but also yields topic embeddings with salient semantics; therefore, it can derive event-oriented topics with meaningful descriptions.
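
The abstract does not spell out the embedding objective, so the following is only a minimal sketch of what a generic translation-based (TransE-style) score looks like, assuming a parent topic, a relation vector, and a child topic that should satisfy parent + relation ≈ child. The function names, the margin value, and the toy vectors are all hypothetical illustrations and are not taken from the UMHE paper.

```python
import numpy as np

def translation_distance(parent, relation, child):
    """L2 distance ||parent + relation - child||; a small value means the triple fits."""
    return float(np.linalg.norm(parent + relation - child))

def margin_loss(pos, neg, margin=1.0):
    """Hinge loss asking a true triple to be closer than a corrupted one by `margin`."""
    return max(0.0, margin + translation_distance(*pos) - translation_distance(*neg))

# Toy 2-D embeddings (hypothetical values, for illustration only).
parent = np.array([0.0, 0.0])
relation = np.array([1.0, 0.0])
true_child = np.array([1.0, 0.0])    # exactly parent + relation
wrong_child = np.array([0.0, 3.0])   # a corrupted (negative) child

fit = translation_distance(parent, relation, true_child)
misfit = translation_distance(parent, relation, wrong_child)
loss = margin_loss((parent, relation, true_child),
                   (parent, relation, wrong_child))
```

In this toy setup the true triple has zero distance and the corrupted one is far enough away that the hinge loss vanishes; in training, such a loss would be minimized over many sampled triples.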

Bibliographic record

Source
ACM Transactions on Knowledge Discovery from Data | 2018, Issue 3 | 38.1-38.26 | 26 pages
Authors
Peng Min; Zhu Jiahui; Wang Hua; Li Xuhui; Zhang Yanchun; Zhang Xiuzhen; Tian Gang
Author affiliations

Wuhan Univ, Sch Comp, Wuhan 430072, Hubei, Peoples R China;

Wuhan Univ, Sch Comp, Wuhan 430072, Hubei, Peoples R China;

Victoria Univ, Ctr Appl Informat, Melbourne, Vic, Australia;

Wuhan Univ, Sch Informat Management, Wuhan 430072, Hubei, Peoples R China;

Victoria Univ, Ctr Appl Informat, Melbourne, Vic, Australia;

RMIT Univ, Sch CS&IT, Melbourne, Vic, Australia;

Wuhan Univ, Sch Comp, Wuhan 430072, Hubei, Peoples R China;

Original format: PDF
Language: English
Keywords
Event-oriented topic; multi-view hierarchical embedding; unsupervised learning; Bayesian rose tree;

Similar literature

1. Unsupervised adaptive microblog filtering for broad dynamic topics [J]. Walid Magdy, Tamer Elsayed. Information Processing & Management. 2016, Issue 4.
2. Identifying and tracking topic-level influencers in the microblog streams [J]. Su Sen, Wang Yakun, Zhang Zhongbao. Machine Learning. 2018, Issue 3.
3. A probabilistic method for emerging topic tracking in Microblog stream [J]. Huang Jiajia, Peng Min, Wang Hua. World Wide Web. 2017, Issue 2.
4. Emerging Topic Detection from Microblog Streams Based on Emerging Pattern Mining* [C]. Min Peng, Shuang Ouyang, Jiahui Zhu. IEEE International Conference on Computer Supported Cooperative Work in Design. 2018.
5. Unsupervised Structural Embedding Methods for Efficient Collective Network Mining [D]. Heimann, Mark. 2020.
6. Microblog Topic-Words Detection Model for Earthquake Emergency Responses Based on Information Classification Hierarchy [O]. Xiaohui Su, Shurui Ma, Xiaokang Qiu. 2021.
7. Mining Lexical Variants from Microblogs: An Unsupervised Multilingual Approach [O]. Alejandro Mosquera, Paloma Moreda. 2015.
