Published in: IEEE International Conference on Tools with Artificial Intelligence
Online Clustering for Topic Detection in Social Data Streams

Abstract

Microblogs have become an important source of information about events happening in a given location during a given time period. Analyzing and clustering these streams of short text messages is an active research area that attracts the interest of both public and private organizations, since the extracted knowledge can be used to better understand people's behavior and the onset of emergency situations. Clustering these streams requires efficient algorithms capable of analyzing this continuous deluge of data. The paper proposes an online algorithm that incrementally groups tweet streams into clusters. The approach summarizes the tweets examined so far into the centroids of the clusters generated up to that point. The assignment of a tweet to a centroid uses a similarity measure that takes into account both the cluster age and the terms occurring in the tweet. Experiments on messages posted by users in the Manhattan area show that the method is able to extract events that actually took place in the examined period.
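The abstract does not give the similarity formula or the update rule, so the following Python sketch only illustrates the general idea of assigning each incoming tweet to an age-aware centroid. Everything specific here is an assumption and not taken from the paper: the class name `OnlineTweetClusterer`, cosine similarity over raw term counts, an exponential age decay with a `half_life` parameter, and a fixed `threshold` for opening a new cluster.

```python
import math
from collections import Counter


class OnlineTweetClusterer:
    """Illustrative sketch only; NOT the algorithm from the paper."""

    def __init__(self, threshold=0.3, half_life=3600.0):
        self.threshold = threshold  # assumed minimum score to join a cluster
        self.half_life = half_life  # assumed age-decay half-life, in seconds
        self.clusters = []          # dicts with "centroid", "size", "last_update"

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity between two term-count vectors (Counters).
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    def _score(self, terms, cluster, now):
        # Assumed combination: term similarity damped by the time elapsed
        # since the cluster last absorbed a tweet.
        age = max(0.0, now - cluster["last_update"])
        decay = 0.5 ** (age / self.half_life)
        return self._cosine(terms, cluster["centroid"]) * decay

    def add(self, text, timestamp):
        """Assign one tweet to a cluster and return the cluster index."""
        terms = Counter(text.lower().split())  # naive whitespace tokenizer
        best_idx, best_score = None, 0.0
        for i, cluster in enumerate(self.clusters):
            score = self._score(terms, cluster, timestamp)
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx is not None and best_score >= self.threshold:
            cluster = self.clusters[best_idx]
            # The centroid is kept as summed term counts; cosine similarity
            # is scale-invariant, so no division by cluster size is needed.
            cluster["centroid"].update(terms)
            cluster["size"] += 1
            cluster["last_update"] = timestamp
            return best_idx
        self.clusters.append(
            {"centroid": terms, "size": 1, "last_update": timestamp}
        )
        return len(self.clusters) - 1
```

Each tweet is processed once and only the centroids are retained, which is what makes the scheme incremental. With this assumed decay, a tweet that matches an old cluster's terms but arrives many half-lives later opens a new cluster instead of joining the stale one.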
