ACM International Conference on Information and Knowledge Management

Processing Continuous Text Queries Featuring Non-Homogeneous Scoring Functions

Abstract

In this work, we are interested in the scalable processing of content filtering queries over text item streams. In particular, we aim to generalize state-of-the-art solutions with non-homogeneous scoring functions that combine query-independent item importance with query-dependent content relevance. While such complex ranking functions are widely used in web search engines, this is, to our knowledge, the first scientific work studying their usage in a continuous query scenario. Our main contribution consists of the definition and evaluation of new, efficient in-memory data structures for indexing continuous top-κ queries, based on an original two-dimensional representation of text queries. We explore locally optimal score bounds and heuristics that efficiently prune the search space of candidate top-κ query results that have to be updated on the arrival of new stream items. Finally, we experimentally evaluate the memory/matching-time trade-offs of these index structures. In particular, we experimentally illustrate their linear scaling behavior with respect to the number of indexed queries.
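
To make the problem setting concrete, the following is a minimal sketch of a naive baseline for continuous top-κ queries. It is not the index structure proposed in the paper. The abstract does not state the form of the scoring function, so the linear combination with weight `alpha` is an assumption, as are the class name `ContinuousTopK` and all method and parameter names. On each item arrival the sketch rescores every query that shares a term with the item, which is exactly the candidate set that score bounds and pruning heuristics would aim to shrink.

```python
import heapq
from collections import defaultdict


class ContinuousTopK:
    """Naive baseline: rescore every query that shares a term with the item."""

    def __init__(self, k, alpha):
        self.k = k                         # result size per query
        self.alpha = alpha                 # assumed weight of item importance
        self.queries = {}                  # query id -> {term: weight}
        self.postings = defaultdict(set)   # term -> ids of queries using it
        self.results = defaultdict(list)   # query id -> min-heap of (score, item id)

    def register(self, qid, term_weights):
        """Index a continuous query under each of its terms."""
        self.queries[qid] = dict(term_weights)
        for term in term_weights:
            self.postings[term].add(qid)

    def score(self, qid, item_terms, importance):
        """Assumed non-homogeneous score: importance mixed with relevance."""
        weights = self.queries[qid]
        relevance = sum(w * item_terms.get(t, 0.0) for t, w in weights.items())
        return self.alpha * importance + (1.0 - self.alpha) * relevance

    def publish(self, item_id, item_terms, importance):
        """Process a new stream item; return ids of queries whose top-k changed."""
        candidates = set()
        for term in item_terms:
            candidates |= self.postings.get(term, set())
        updated = []
        for qid in candidates:
            s = self.score(qid, item_terms, importance)
            heap = self.results[qid]
            if len(heap) < self.k:
                heapq.heappush(heap, (s, item_id))
                updated.append(qid)
            elif s > heap[0][0]:
                # The new item beats the current k-th best result.
                heapq.heapreplace(heap, (s, item_id))
                updated.append(qid)
        return sorted(updated)

    def top(self, qid):
        """Current results for a query, best first."""
        return sorted(self.results[qid], reverse=True)
```

With `k=1` and `alpha=0.5`, an item that matches a query weakly but carries high importance can displace a better-matching item of low importance, which is the behavior that distinguishes this kind of scoring from purely content-based ranking.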