
Micro-Review Synthesis for Multi-entity Summarization



Abstract

Location-based social networks (LBSNs), exemplified by Foursquare, are fast gaining popularity. One important feature of LBSNs is micro-review. Upon check-in at a particular venue, a user may leave a short review (up to 200 characters long), also known as a tip. These tips are an important source of information for others to know more about various aspects of an entity (e.g., restaurant), such as food, waiting time, or service. However, a user is often interested not in one particular entity, but rather in several entities collectively, for instance within a neighborhood or a category. In this paper, we address the problem of summarizing the tips of multiple entities in a collection, by way of synthesizing new micro-reviews that pertain to the collection, rather than to the individual entities per se. We formulate this problem in terms of first finding a representation of the collection, by identifying a number of "aspects" that link common threads across two or more entities within the collection. We express these aspects as dense subgraphs in a graph of sentences derived from the multi-entity corpora. This leads to a formulation of maximal multi-entity quasi-cliques, as well as a heuristic algorithm to find K such quasi-cliques maximizing the coverage over the multi-entity corpora. To synthesize a summary tip for each aspect, we select a small number of sentences from the corresponding quasi-clique, balancing conciseness and representativeness in terms of a facility location problem. Our approach performs well on collections of Foursquare entities based on localities and categories, producing more representative and diverse summaries than the baselines.
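
The two steps named in the abstract, testing whether a set of sentences forms a multi-entity quasi-clique and then picking a few representative sentences via a facility location objective, can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the word-overlap (Jaccard) similarity, the degree-based quasi-clique test with threshold `gamma`, the edge threshold of 0.25, the greedy selection rule, and all function names and sample tips.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two sentences (assumed stand-in measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def is_multi_entity_quasi_clique(nodes, edges, entity_of, gamma=0.5, min_entities=2):
    """Assumed definition: every node is adjacent to at least gamma * (|S| - 1)
    other nodes in S, and the sentences in S come from at least `min_entities`
    distinct entities."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return False
    if len({entity_of[n] for n in nodes}) < min_entities:
        return False
    need = gamma * (len(nodes) - 1)
    for n in nodes:
        degree = sum(1 for m in nodes if m != n and frozenset((n, m)) in edges)
        if degree < need:
            return False
    return True


def greedy_facility_location(sentences, k):
    """Greedily pick up to k sentences maximizing the facility location value:
    the sum, over all sentences, of the similarity to the closest picked one."""
    n = len(sentences)
    sim = [[jaccard(sentences[i], sentences[j]) for j in range(n)] for i in range(n)]
    best = [0.0] * n  # best[i]: similarity of sentence i to its closest pick so far
    selected = []
    for _ in range(min(k, n)):
        gain, pick = -1.0, None
        for c in range(n):
            if c in selected:
                continue
            # Marginal gain of adding candidate c to the current selection.
            g = sum(max(sim[c][i] - best[i], 0.0) for i in range(n))
            if g > gain:
                gain, pick = g, c
        selected.append(pick)
        best = [max(best[i], sim[pick][i]) for i in range(n)]
    return [sentences[i] for i in selected]


# Hypothetical tips from three hypothetical venues.
tips = [
    "great coffee and pastries",
    "great coffee",
    "long wait for a table",
    "long wait on weekends",
]
entity_of = {0: "cafe_a", 1: "cafe_b", 2: "cafe_a", 3: "cafe_c"}

# Sentence graph: connect two tips when their similarity reaches 0.25.
edges = {
    frozenset((i, j))
    for i in range(len(tips))
    for j in range(i + 1, len(tips))
    if jaccard(tips[i], tips[j]) >= 0.25
}
```

With this toy data, `{0, 1}` passes the test because the two coffee tips are linked and come from different venues, while `{0, 2}` fails because both tips belong to the same venue. A budget of `k` sentences is what stands in for conciseness here; the facility location value stands in for representativeness.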
