Conference on Empirical Methods in Natural Language Processing

Using Content-level Structures for Summarizing Microblog Repost Trees



Abstract

A microblog repost tree provides strong clues about how the event it describes develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework that differentiates two kinds of messages on repost trees, called leaders and followers, which are derived from content-level structure information, i.e., the contents of messages and the reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.
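The idea of ranking messages with a random walk guided by leader probabilities can be illustrated with a small sketch. This is not the authors' model: it is a generic random walk with restart, where transitions follow bag-of-words cosine similarity and the restart distribution is proportional to a per-message leader probability. The function name `rank_messages`, the `damping` value, and the similarity measure are all assumptions chosen for illustration. In the paper's pipeline the leader probabilities would presumably come from the CRF; here they are simply passed in as numbers.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_messages(texts, leader_probs, damping=0.85, iters=100, tol=1e-9):
    """Score messages with a leader-biased random walk (illustrative only).

    texts        -- list of message strings from one repost tree
    leader_probs -- hypothetical per-message leader probabilities
    Returns a list of scores that sums to 1; higher means more salient.
    """
    n = len(texts)
    bags = [Counter(t.lower().split()) for t in texts]

    # Pairwise content similarity, with no self-loops.
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]

    # Restart ("jump") distribution: proportional to leader probability.
    total = sum(leader_probs)
    jump = [p / total for p in leader_probs] if total > 0 else [1.0 / n] * n

    # Row-normalise similarities; isolated messages fall back to the jump
    # distribution so that every row is a valid probability distribution.
    trans = []
    for row in sim:
        s = sum(row)
        trans.append([w / s for w in row] if s > 0 else list(jump))

    # Power iteration until the scores stop changing.
    score = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - damping) * jump[j]
               + damping * sum(score[i] * trans[i][j] for i in range(n))
               for j in range(n)]
        delta = sum(abs(x - y) for x, y in zip(new, score))
        score = new
        if delta < tol:
            break
    return score
```

Using soft leader probabilities as the restart distribution, rather than hard leader/follower labels, is one simple way to keep a single misclassified message from dominating the ranking. It is only loosely analogous to the adjustment steps described in the abstract.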
