Journal: ISPRS International Journal of Geo-Information

A Varied Density-based Clustering Approach for Event Detection from Heterogeneous Twitter Data

Abstract

Applying spatial clustering to geotagged tweets extracts latent knowledge from Twitter and makes it possible to discover events and their locations. DBSCAN (density-based spatial clustering of applications with noise), which has been widely used to retrieve events from geotagged tweets, cannot efficiently detect clusters when the dataset exhibits significant spatial heterogeneity, as is the case for Twitter data, where the distribution of users and the intensity of tweet publishing vary across the study areas. This study proposes the VDCT (Varied Density-based spatial Clustering for Twitter data) algorithm, which extracts clusters from geotagged tweets while accounting for spatial heterogeneity. The algorithm employs exponential spline interpolation to determine different search radii for cluster detection. In addition to spatial proximity, it also takes textual similarity among tweets into account. To examine the efficiency of the algorithm, geotagged tweets collected during a hurricane in the United States were used for event detection. The output clusters of VDCT were compared with those of DBSCAN. Visual and quantitative comparison of the results demonstrated the feasibility of the proposed method.
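
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: density-based clustering with a per-point search radius and a textual-similarity check. It is not VDCT itself. As assumptions made purely for illustration, the radius of each point is the distance to its k-th nearest neighbour, capped at a maximum (a simple stand-in for the exponential spline interpolation mentioned above), text similarity is Jaccard overlap of word sets, and coordinates are planar in arbitrary units. All function names, parameters, and sample tweets are hypothetical.

```python
import math


def jaccard(a, b):
    """Jaccard similarity between two sets of tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def local_radii(coords, k, max_radius):
    """Per-point radius: distance to the k-th nearest neighbour, capped.

    Assumption: this replaces the paper's spline-based radius selection.
    """
    radii = []
    for i, p in enumerate(coords):
        dists = sorted(math.dist(p, q) for j, q in enumerate(coords) if j != i)
        kth = dists[k - 1] if len(dists) >= k else max_radius
        radii.append(min(kth, max_radius))
    return radii


def cluster_tweets(tweets, k=2, min_pts=3, min_sim=0.2, max_radius=5.0):
    """Label each (x, y, text) tweet with a cluster id, or -1 for noise."""
    coords = [(x, y) for x, y, _ in tweets]
    tokens = [set(text.lower().split()) for _, _, text in tweets]
    radii = local_radii(coords, k, max_radius)

    def neighbours(i):
        # A neighbour must be spatially close AND textually similar.
        return [
            j for j in range(len(tweets))
            if math.dist(coords[i], coords[j]) <= radii[i]
            and (j == i or jaccard(tokens[i], tokens[j]) >= min_sim)
        ]

    labels = [None] * len(tweets)  # None = unvisited, -1 = noise
    cluster_id = 0
    for i in range(len(tweets)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster_id += 1
    return labels


# Hypothetical sample: a dense group, a sparse group, and two unrelated tweets.
tweets = [
    (0, 0, "hurricane flooding downtown"),
    (1, 0, "flooding downtown hurricane streets"),
    (0, 1, "hurricane flooding on main street"),
    (1, 1, "downtown flooding hurricane"),
    (50, 50, "power outage suburb"),
    (54, 50, "suburb power outage again"),
    (50, 54, "power outage reported suburb"),
    (54, 54, "outage suburb no power"),
    (0.5, 0.5, "coffee morning"),   # close to the dense group, unrelated text
    (200, 200, "traffic jam"),      # spatially isolated
]
labels = cluster_tweets(tweets)
```

In this sketch the dense and sparse groups each receive a radius matched to their own spacing, which a single global DBSCAN radius cannot provide, while the unrelated tweet inside the dense group is rejected by the text check and the isolated tweet by the radius cap.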