Published in: IEEE International Conference on Data Engineering

Incremental discovery of prominent situational facts



Abstract

We study the novel problem of finding new, prominent situational facts: emerging statements about objects that stand out within certain contexts. Many such facts are newsworthy, e.g., an athlete's outstanding performance in a game or a viral video's impressive popularity. Effective and efficient identification of these facts assists journalists in reporting, one of the main goals of computational journalism. Technically, we consider an ever-growing table of objects with dimension and measure attributes. A situational fact is a “contextual” skyline tuple: one that stands out against the historical tuples in a context, where the context is specified by a conjunctive constraint on dimension attributes and the comparison is made over a set of measure attributes. New tuples are constantly added to the table, reflecting events happening in the real world. Our goal is to discover the constraint-measure pairs that qualify a new tuple as a contextual skyline tuple, and to discover them quickly, before the event becomes yesterday's news. A brute-force approach requires exhaustive comparison with every tuple, under every constraint, and in every measure subspace. We design algorithms that address these three challenges with three corresponding ideas: tuple reduction, constraint pruning, and sharing computation across measure subspaces. We also adopt a simple prominence measure to rank the discovered facts when they are numerous. Experiments over two real datasets validate the effectiveness and efficiency of our techniques.
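To make the problem statement concrete, the following is a minimal sketch of the brute-force baseline the abstract describes, not the paper's optimized algorithms. It rests on several assumptions not stated in the abstract: tuples are plain dictionaries, larger measure values are better, dominance follows the usual skyline definition, and a constraint either binds a dimension to the new tuple's value or leaves it unconstrained. All function names, attribute names, and data are hypothetical.

```python
from itertools import combinations


def dominates(a, b, measures):
    """Assumed skyline dominance: a is at least as good as b on every
    measure and strictly better on at least one (larger is better)."""
    return (all(a[m] >= b[m] for m in measures)
            and any(a[m] > b[m] for m in measures))


def constraints_for(t, dims):
    """Every conjunctive constraint that tuple t satisfies: each dimension
    is either bound to t's value or left unconstrained."""
    for k in range(len(dims) + 1):
        for bound in combinations(dims, k):
            yield tuple((d, t[d]) for d in bound)


def measure_subspaces(measures):
    """Every non-empty subset of the measure attributes."""
    for k in range(1, len(measures) + 1):
        yield from combinations(measures, k)


def situational_facts(new, history, dims, measures):
    """Brute force: compare the new tuple with every historical tuple,
    under every constraint, in every measure subspace."""
    facts = []
    for constraint in constraints_for(new, dims):
        context = [h for h in history
                   if all(h[d] == v for d, v in constraint)]
        for subspace in measure_subspaces(measures):
            # The new tuple is a contextual skyline tuple if no
            # historical tuple in the context dominates it.
            if not any(dominates(h, new, subspace) for h in context):
                facts.append((constraint, subspace))
    return facts


# Hypothetical data: one row per player performance in a game.
history = [
    {"player": "A", "team": "X", "points": 30, "assists": 5},
    {"player": "B", "team": "X", "points": 20, "assists": 10},
    {"player": "C", "team": "Y", "points": 35, "assists": 12},
]
new = {"player": "A", "team": "X", "points": 32, "assists": 8}

facts = situational_facts(new, history,
                          dims=["player", "team"],
                          measures=["points", "assists"])
for constraint, subspace in facts:
    print(constraint, subspace)
```

In this toy data the new tuple is not a skyline tuple under the empty constraint, because player C dominates it in every subspace, but it does stand out within team X on points. The cost is exponential in both the number of dimensions and the number of measures, which is the expense the three ideas named in the abstract are meant to reduce.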
