Towards Effective Event Detection, Tracking and Summarization on Microblog Data

Abstract

Microblogging has become one of the most popular social Web applications in recent years. Posting short messages (i.e., a maximum of 140 characters) to the Web at any time and from any place lowers the usage barrier, accelerates information diffusion, and makes instant publication possible. Among the posts users publish every day, many relate to recent or real-time events in daily life. Microblog sites usually display on their homepages a list of words representing the trending topics of a time period (e.g., 24 hours, a week or even longer), but these topical words alone do not give users a comprehensive view of a topic, especially users without background knowledge. Moreover, the only way for users to learn the details of a topic is to open each post in the relevant list. In this paper, we propose a unified workflow of event detection, tracking and summarization on microblog data. In particular, we introduce novel features that account for the characteristics of microblog data for topical word selection, and thus for event detection. In the tracking phase, a bipartite graph is constructed to capture the relationship between events occurring in adjacent time periods, and each matched event pair is grouped into an event chain. Furthermore, inspired by diversity theory in Web search, we are the first to summarize event chains by considering both content coverage and evolution over time. The experimental results show the effectiveness of our approach on microblog data.
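The tracking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the edge weights or the matching algorithm, so everything specific here is an assumption made for illustration: each event is represented as a set of topical words, edges of the bipartite graph are weighted by Jaccard similarity, and a greedy one-to-one matching with a hypothetical `threshold` stands in for whatever matching procedure the paper actually uses. The function names `jaccard`, `match_adjacent` and `build_chains` are likewise hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity between two sets of topical words (assumed edge weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_adjacent(prev_events, next_events, threshold=0.3):
    """Greedy one-to-one matching on the bipartite graph between two adjacent windows.

    Returns sorted (prev_index, next_index) pairs whose similarity is >= threshold.
    Greedy selection is a simplification, not an optimal maximum-weight matching.
    """
    edges = [
        (jaccard(p, n), i, j)
        for (i, p), (j, n) in product(enumerate(prev_events), enumerate(next_events))
    ]
    edges = [e for e in edges if e[0] >= threshold]
    # Highest similarity first; indices break ties deterministically.
    edges.sort(key=lambda e: (-e[0], e[1], e[2]))
    used_prev, used_next, pairs = set(), set(), []
    for _, i, j in edges:
        if i not in used_prev and j not in used_next:
            used_prev.add(i)
            used_next.add(j)
            pairs.append((i, j))
    return sorted(pairs)


def build_chains(windows, threshold=0.3):
    """Link matched event pairs across consecutive windows into event chains.

    `windows` is a list of time windows; each window is a list of events.
    Each chain is a list of (window_index, event_index) tuples.
    """
    chains, tail = [], {}
    for t, events in enumerate(windows):
        matched = {}
        if t > 0:
            pairs = match_adjacent(windows[t - 1], events, threshold)
            matched = {j: i for i, j in pairs}
        for j in range(len(events)):
            if j in matched:
                # Extend the chain that ends at the matched previous event.
                chain = tail[(t - 1, matched[j])]
            else:
                # An unmatched event starts a new chain.
                chain = []
                chains.append(chain)
            chain.append((t, j))
            tail[(t, j)] = chain
    return chains
```

For example, calling `build_chains` on two windows in which only one event shares topical words with an event in the next window yields one two-step chain and leaves the remaining events as single-element chains. A production version would likely replace the greedy step with an optimal assignment and use a similarity measure suited to short, noisy posts.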

Bibliographic details

  • Source
    Web-age information management | 2011 | pp. 652-663 | 12 pages
  • Conference location: Wuhan (CN)
  • Author affiliation (identical for all five listed entries)

    Apex Data Knowledge Management Lab, Shanghai Jiao Tong University, No. 800 Dongchuan Road, Shanghai, 200240, China

  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Information processing
