MR-LDA: An Efficient Topic Model for Classification of Short Text in Big Social Data

Xiongwen Pang; Benshuai Wan; Huifang Li; Weiwei Lin

Abstract

Latent Dirichlet Allocation (LDA) is an efficient method for text mining, but applying it directly to Chinese micro-blog texts does not work well, because micro-blogs are more social, briefer, and more closely related to one another. Building on LDA, this paper proposes a Micro-blog Relation LDA model (MR-LDA), which takes the relations among Chinese micro-blog documents into account to aid topic mining. The authors extend LDA in two ways. First, they aggregate several Chinese micro-blogs into a single micro-blog document to address the short-text problem. Second, they model the generation process of Chinese micro-blogs more accurately by taking the relationships between micro-blog documents into account. MR-LDA is therefore better suited to modeling Chinese micro-blog data. Gibbs sampling is used to infer the model. Experimental results on real datasets show that MR-LDA offers an effective solution to text mining for Chinese micro-blogs.
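The first extension (pooling several short posts into one longer document) and the use of Gibbs sampling can be illustrated with a minimal sketch. This is not the authors' MR-LDA: the abstract does not say how posts are grouped or how relations enter the model, so pooling by author is an assumption, the sampler is the standard collapsed Gibbs sampler for plain LDA, and every name (`pool_posts`, `lda_gibbs`, the `author` and `tokens` fields) is hypothetical.

```python
import random
from collections import defaultdict


def pool_posts(posts, key):
    """Concatenate the tokens of all posts that share a pooling key."""
    pooled = defaultdict(list)
    for post in posts:
        pooled[key(post)].extend(post["tokens"])
    return dict(pooled)


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (no relation modeling)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    n_dk = [[0] * num_topics for _ in docs]       # document-topic counts
    n_kw = [[0] * V for _ in range(num_topics)]   # topic-word counts
    n_k = [0] * num_topics                        # tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the current token from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Conditional topic distribution (unnormalized).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1
    return n_dk, n_kw, vocab


# Toy data: four short posts from two hypothetical authors.
posts = [
    {"author": "a", "tokens": ["match", "goal"]},
    {"author": "a", "tokens": ["goal", "team"]},
    {"author": "b", "tokens": ["stock", "market"]},
    {"author": "b", "tokens": ["market", "bank"]},
]
pooled = pool_posts(posts, key=lambda p: p["author"])
doc_topic, topic_word, vocab = lda_gibbs(list(pooled.values()), num_topics=2, iters=50)
```

Pooling gives each document more word co-occurrences for the sampler to work with, which is the usual motivation for aggregating short texts. A relation-aware model such as the one the abstract describes would need additional structure beyond this sketch.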

Bibliographic details

Source: International Journal of Grid and High Performance Computing, 2016, Issue 4, pp. 100-113 (14 pages)

Authors: Xiongwen Pang; Benshuai Wan; Huifang Li; Weiwei Lin

Author affiliations:

School of Computer, South China Normal University, Guangzhou, China;

Information Technological Department, Guangdong Nanhai Rural Commercial Bank Company Limited, Guangzhou, China;

School of Computer, South China Normal University, Guangzhou, China;

School of Computer Science and Engineering, South China University of Technology, Guangzhou, China

Original format: PDF
Language: English
Keywords: Big Data; Latent Dirichlet Allocation; Micro-Blog; Social Network; Topic Mining
