Venue: Conference on Empirical Methods in Natural Language Processing

Using Content-level Structures for Summarizing Microblog Repost Trees


Abstract

A microblog repost tree provides strong clues about how the event it describes develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework that differentiates two kinds of messages on repost trees, leaders and followers, which are derived from content-level structure information, i.e., the contents of messages and their reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model that ranks and selects salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps that sample from the leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.
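The idea of ranking messages with a random walk that favors likely leaders can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not give the transition formula or the adjustment steps, so the code assumes a simple PageRank-style walk in which the weight of moving to a message is its word-overlap similarity scaled by a leader probability. The function names (`jaccard`, `leader_biased_rank`, `summarize`), the similarity measure, and the damping value are all hypothetical choices, and the leader probabilities are supplied by hand rather than produced by a CRF.

```python
def jaccard(a, b):
    """Word-overlap similarity between two messages (an illustrative choice)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def leader_biased_rank(messages, leader_prob, damping=0.85, iters=100):
    """Score messages with a PageRank-style walk biased toward likely leaders.

    Assumption: the weight of stepping from message i to message j is
    similarity(i, j) * leader_prob[j]. The paper's actual model may differ.
    """
    n = len(messages)
    weights = [
        [0.0 if i == j else jaccard(messages[i], messages[j]) * leader_prob[j]
         for j in range(n)]
        for i in range(n)
    ]
    scores = [1.0 / n] * n
    for _ in range(iters):
        # Every message receives a uniform share from the random jump.
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            total = sum(weights[i])
            for j in range(n):
                # Rows with no outgoing weight fall back to a uniform jump.
                p = weights[i][j] / total if total > 0 else 1.0 / n
                new[j] += damping * scores[i] * p
        scores = new
    return scores


def summarize(messages, leader_prob, k=2):
    """Return the k highest-scoring messages in their original order."""
    scores = leader_biased_rank(messages, leader_prob)
    top = sorted(range(len(messages)), key=lambda i: scores[i], reverse=True)[:k]
    return [messages[i] for i in sorted(top)]
```

Calling `summarize(messages, leader_prob, k)` with a list of message strings and one probability per message returns the `k` top-ranked messages. In the framework described above, those probabilities would come from the leader detection step rather than being set manually.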
