Conference: International Conference on Information, Intelligence, Systems and Applications

Identifying semantically meaningful sub-communities within Twitter blogosphere

Abstract

This paper addresses the problem of detecting semantically meaningful groups within a sub-community of Twitter micro-bloggers, using a multi-objective clustering approach built on topic modeling. The proposed group detection method is anchored on the Latent Dirichlet Allocation (LDA) topic modeling technique and aims to identify clusters of Twitter users that are optimal in terms of both spatial and topical compactness. Specifically, group detection is formulated as a multi-objective optimization problem with two complementary cluster formation objectives. The first, spatial compactness, is achieved by minimizing the overall deviation from the corresponding cluster centers. The second, topical compactness, is achieved by minimizing the portion of probability mass that each cluster centroid assigns to low-probability topics. Optimization is performed with a multi-objective genetic algorithm, which yields a variety of cluster structures that are significantly more interpretable than the cluster assignments obtained with traditional single-objective clustering algorithms.
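The two objectives named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions the abstract does not specify: users are represented by rows of a user-by-topic probability matrix `theta` (such as LDA might produce), deviation is measured with Euclidean distance, and "low-probability topics" are taken to be every topic outside a centroid's `top_m` most probable ones. The function names and the toy data are hypothetical, and the genetic algorithm itself is not reproduced; only the objective evaluation and a Pareto-dominance check are shown.

```python
import numpy as np

def cluster_centroids(theta, labels, k):
    # Mean topic distribution of each cluster (assumes no cluster is empty).
    return np.vstack([theta[labels == c].mean(axis=0) for c in range(k)])

def overall_deviation(theta, labels, cents):
    # Spatial compactness: summed Euclidean distance of each user to its
    # cluster centroid (the distance metric is an assumption).
    return float(np.linalg.norm(theta - cents[labels], axis=1).sum())

def low_topic_mass(cents, top_m=1):
    # Topical compactness: probability mass each centroid places outside its
    # top_m most probable topics, summed over clusters (top_m is an assumption).
    sorted_desc = np.sort(cents, axis=1)[:, ::-1]
    return float(sorted_desc[:, top_m:].sum())

def evaluate(theta, labels, k, top_m=1):
    # Both objectives are to be minimized.
    cents = cluster_centroids(theta, labels, k)
    return overall_deviation(theta, labels, cents), low_topic_mass(cents, top_m)

def dominates(a, b):
    # Pareto dominance for minimization: a is no worse everywhere, better somewhere.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Toy data: four users over three topics (each row sums to 1).
theta = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
])

good = evaluate(theta, np.array([0, 0, 1, 1]), k=2)  # groups similar users
bad = evaluate(theta, np.array([0, 1, 0, 1]), k=2)   # mixes dissimilar users
print(good, bad, dominates(good, bad))
```

On this toy data the assignment that groups users with similar topic profiles scores lower on both objectives than the mixed assignment, so it dominates it. A multi-objective genetic algorithm would search over label vectors and keep the set of mutually non-dominated assignments, which is where the variety of cluster structures mentioned in the abstract would come from.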
