Journal: IEEE Transactions on Knowledge and Data Engineering

On Summarization and Timeline Generation for Evolutionary Tweet Streams

Abstract

Short-text messages such as tweets are being created and shared at an unprecedented rate. Tweets in their raw form, while informative, can also be overwhelming. For both end-users and data analysts, it is a nightmare to plow through millions of tweets that contain an enormous amount of noise and redundancy. In this paper, we propose a novel continuous summarization framework called Sumblr to alleviate the problem. In contrast to traditional document summarization methods, which focus on static, small-scale data sets, Sumblr is designed to deal with dynamic, fast-arriving, and large-scale tweet streams. Our proposed framework consists of three major components. First, we propose an online tweet stream clustering algorithm to cluster tweets and maintain distilled statistics in a data structure called the tweet cluster vector (TCV). Second, we develop a TCV-Rank summarization technique for generating online summaries and historical summaries of arbitrary time durations. Third, we design an effective topic evolution detection method, which monitors summary-based and volume-based variations to produce timelines automatically from tweet streams. Our experiments on large-scale real tweets demonstrate the efficiency and effectiveness of our framework.
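The first component, online clustering that keeps only distilled per-cluster statistics, can be illustrated with a minimal sketch. The abstract does not specify what a TCV contains or how similarity is computed, so everything below is an assumption made for illustration: the `ClusterStats` fields, the whitespace tokenizer, cosine similarity over raw term counts, and the `threshold` value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase whitespace tokenization (an illustrative simplification)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ClusterStats:
    """Hypothetical per-cluster summary; NOT the paper's actual TCV layout."""
    term_sum: Counter = field(default_factory=Counter)  # summed term counts
    count: int = 0        # number of tweets absorbed so far
    ts_sum: float = 0.0   # sum of timestamps (gives the mean arrival time)
    last_ts: float = 0.0  # most recent timestamp seen

    def add(self, terms, ts):
        """Fold one tweet into the summary without storing the tweet itself."""
        self.term_sum.update(terms)
        self.count += 1
        self.ts_sum += ts
        self.last_ts = max(self.last_ts, ts)


class OnlineClusterer:
    """Single-pass clustering: join the most similar cluster or open a new one."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed similarity cutoff
        self.clusters = []

    def process(self, text, ts):
        """Assign one arriving tweet and return the index of its cluster."""
        terms = Counter(tokenize(text))
        best, best_sim = None, 0.0
        for i, cluster in enumerate(self.clusters):
            sim = cosine(terms, cluster.term_sum)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None or best_sim < self.threshold:
            self.clusters.append(ClusterStats())
            best = len(self.clusters) - 1
        self.clusters[best].add(terms, ts)
        return best
```

Calling `OnlineClusterer().process(text, timestamp)` once per arriving tweet updates the summaries in a single pass, so memory grows with the number of clusters rather than the number of tweets. A real system would also need to merge or expire stale clusters, which this sketch omits.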
