Conference: International ACM SIGIR Conference on Research and Development in Information Retrieval

Incremental Diversification for Very Large Sets: a Streaming-based Approach

Abstract

Result diversification is an effective way to reduce the risk that none of the returned results satisfies a user's query intent, and it has been shown to decrease query abandonment substantially. However, computing an optimally diverse set is NP-hard for the usual objectives. Existing greedy diversification algorithms require random access to the input set, which makes them impractical for large result sets or continuous data. To address this, we present a novel diversification approach that treats the input as a stream and processes each element incrementally, maintaining a near-optimal diverse set at any point in the stream. Our approach has linear computation time and constant memory complexity with respect to input size, without significant loss of diversification quality. In an extensive evaluation on several real-world data sets, we show the applicability and efficiency of our algorithm for large result sets as well as for continuous query scenarios such as news stream subscriptions.
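The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of stream-based diversification, not the authors' method. It assumes a max-sum style objective (weighted relevance plus weighted pairwise distance), a hypothetical item shape of `(relevance, feature_vector)`, and a simple swap rule: each arriving element may replace one of the `k` kept elements if that raises the objective. The names `stream_diversify`, `objective`, `distance`, and the trade-off parameter `lam` are all illustrative assumptions.

```python
from itertools import combinations
from typing import Iterable, List, Sequence, Tuple

# Hypothetical item shape (an assumption): (relevance score, feature vector).
Item = Tuple[float, Sequence[float]]


def distance(a: Item, b: Item) -> float:
    """Euclidean distance between the feature vectors of two items."""
    return sum((x - y) ** 2 for x, y in zip(a[1], b[1])) ** 0.5


def objective(selected: List[Item], lam: float) -> float:
    """Assumed max-sum score: weighted relevance plus weighted pairwise distance."""
    relevance = sum(item[0] for item in selected)
    diversity = sum(distance(a, b) for a, b in combinations(selected, 2))
    return (1 - lam) * relevance + lam * diversity


def stream_diversify(stream: Iterable[Item], k: int, lam: float = 0.5) -> List[Item]:
    """Single pass over the stream, keeping at most k items at any time."""
    selected: List[Item] = []
    for item in stream:
        # Fill the set until it holds k items.
        if len(selected) < k:
            selected.append(item)
            continue
        # Try swapping the new item for each kept item; keep the best improving swap.
        best_score = objective(selected, lam)
        best_index = -1
        for i in range(k):
            candidate = selected[:i] + [item] + selected[i + 1:]
            score = objective(candidate, lam)
            if score > best_score:
                best_score, best_index = score, i
        if best_index >= 0:
            selected[best_index] = item
    return selected
```

In this sketch each arriving element triggers `k` candidate evaluations whose cost depends only on `k`, so total work grows linearly with the stream length and memory stays bounded by `k` items. The input is read once and never revisited, which is the property that removes the need for random access.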