
Time Makes Sense: Event Discovery in Twitter Using Temporal Similarity


Abstract

Temporal text mining (TTM) has recently attracted the attention of scientists as a means to discover and track discussions in micro-blogs in real time. However, current approaches to temporal mining suffer from efficiency problems when applied to large micro-blog streams such as Twitter, which now averages 500 million tweets per day. We propose a technique named SAX (based on the Symbolic Aggregate Approximation algorithm) that discretizes the temporal series of each term into a small set of levels, yielding one string per term. We then define a subset of "interesting" strings, i.e., those representing patterns of collective attention. Sliding temporal windows are used to detect clusters of terms with the same string. We show that SAX is more efficient (by orders of magnitude) than other approaches to temporal mining in the literature. In this paper, we evaluate SAX on the task of event discovery over one year of the 1% worldwide Twitter stream.
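The pipeline outlined in the abstract (discretize each term's series into a string, keep "interesting" strings, group terms that share a string within a window) can be sketched as follows. This is a minimal illustration of generic Symbolic Aggregate Approximation, not the authors' implementation: the function names, the two-letter alphabet, the segment count, and the regular expression used for "interesting" strings are all assumptions made for the example.

```python
import re
from bisect import bisect_right
from collections import defaultdict
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet="ab"):
    """Return a SAX-style string for one window of a term's frequency series."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        # A flat series has no temporal shape; map it to the lowest level.
        return alphabet[0] * n_segments
    z = [(x - mu) / sigma for x in series]
    # Piecewise aggregate approximation: average each equal-width segment.
    # Assumes len(series) is a multiple of n_segments.
    seg = len(z) // n_segments
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    # Breakpoints that split a standard normal into equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# Hypothetical notion of an "interesting" string: quiet, then a burst,
# then optionally quiet again. The source does not specify the pattern.
INTERESTING = re.compile(r"^a+b+a*$")


def cluster_window(window_counts, n_segments, alphabet="ab"):
    """Group the terms of one temporal window by their (interesting) string."""
    clusters = defaultdict(list)
    for term, counts in window_counts.items():
        word = sax_word(counts, n_segments, alphabet)
        if INTERESTING.match(word):
            clusters[word].append(term)
    return dict(clusters)


# Toy window of per-interval term counts (made-up numbers).
window = {
    "earthquake": [1, 1, 1, 1, 9, 9, 1, 1],
    "tsunami": [0, 0, 0, 0, 5, 5, 0, 0],
    "coffee": [3, 3, 3, 3, 3, 3, 3, 3],
}
clusters = cluster_window(window, n_segments=4)
```

In this toy window, the two bursty terms share one string and fall into the same cluster, while the flat term is filtered out. Sliding the window forward and repeating the call would give one set of clusters per time step.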
