Cost-Effective Online Trending Topic Detection and Popularity Prediction in Microblogging

Abstract

Identifying topic trends on microblogging services such as Twitter and estimating those topics' future popularity have great academic and business value, especially when both can be done in real time. For any third party, however, capturing and processing such huge volumes of real-time microblog data is almost infeasible, because there are always API (Application Programming Interface) request limits, monitoring and computing budgets, and timeliness requirements. To deal with these challenges, we propose a cost-effective system framework with algorithms that automatically select, offline, a subset of representative users in a microblogging network under given cost constraints. The proposed system then monitors online only these selected users' real-time microposts and uses them to detect the overall trending topics and predict their future popularity across the whole microblogging network. The framework is therefore practical for real-time use: it avoids the high cost of capturing and processing the full real-time data without compromising detection and prediction performance under the given cost constraints. Experiments on a real microblog dataset show that by tracking only 500 users out of 0.6 million and processing no more than 30,000 microposts daily, the proposed system could detect and predict about 92% of trending topics, on average more than 10 hours before they appear in official trends lists.
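The offline step described in the abstract, choosing a small set of representative users under a cost budget, can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify. It swaps in a generic greedy heuristic for budgeted coverage. The function name `select_users`, the per-user cost (taken here as expected daily microposts), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def select_users(
    topics_by_user: Dict[str, Set[str]],
    cost_by_user: Dict[str, float],
    budget: float,
) -> List[str]:
    """Greedy budgeted coverage (illustrative, not the paper's method).

    Repeatedly picks the affordable user who adds the most not-yet-covered
    historical trending topics per unit of monitoring cost.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    remaining = budget
    candidates = set(topics_by_user)
    while candidates:
        best_user = None
        best_ratio = 0.0
        for user in sorted(candidates):  # sorted for deterministic tie-breaking
            cost = cost_by_user[user]
            if cost <= 0 or cost > remaining:
                continue  # skip users we cannot afford
            gain = len(topics_by_user[user] - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best_user, best_ratio = user, ratio
        if best_user is None:
            break  # nobody affordable adds new coverage
        selected.append(best_user)
        covered |= topics_by_user[best_user]
        remaining -= cost_by_user[best_user]
        candidates.remove(best_user)
    return selected


# Hypothetical toy data: past trending topics each user posted about,
# and each user's assumed monitoring cost (expected microposts per day).
topics_by_user = {
    "alice": {"t1", "t2", "t3"},
    "bob": {"t3", "t4"},
    "carol": {"t1"},
    "dave": {"t4", "t5"},
}
cost_by_user = {"alice": 30, "bob": 10, "carol": 5, "dave": 10}

selected = select_users(topics_by_user, cost_by_user, budget=25)
all_topics = set().union(*topics_by_user.values())
covered = set().union(*(topics_by_user[u] for u in selected))
coverage = len(covered) / len(all_topics)
print(selected, coverage)
```

In this toy run the budget excludes the most prolific user, yet the three cheaper users together still cover four of the five topics, which mirrors the abstract's point that a small monitored subset can stand in for the whole network.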

Bibliographic details

  • Source
    ACM Transactions on Information Systems | 2017, Issue 3 | 18.1-18.36 | 36 pages
  • Author affiliations

    Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai, Peoples R China;

    Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai, Peoples R China;

    Santa Clara Univ, Dept Comp Engn, Santa Clara, CA USA;

    Aston Univ, Sch Engn & Appl Sci, Birmingham, W Midlands, England;

    Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai, Peoples R China;

    Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai, Peoples R China;

    Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: eng
  • CLC classification
  • Keywords

    Topic detection; prediction; microblogging; cost;

