BinRank: Scaling Dynamic Authority-Based Search Using Materialized SubGraphs

机译：BinRank：使用物化子图扩展基于动态权限的搜索

获取原文

页面导航

摘要
著录项
相似文献
相关主题

摘要

Dynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high quality, high recall search in databases and the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of pre-computed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the sub-graphs. BinRank generates the sub-graphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a sub-graph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve sub-second query execution time on the English Wikipedia dataset, while producing high quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state of the art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.

机译：基于动态权限的关键字搜索算法（例如ObjectRank和个性化PageRank）利用语义链接信息在数据库和Web中提供高质量，高召回率的搜索。从概念上讲，这些算法需要在整个图形上进行查询时的PageRank样式的迭代计算。这种计算对于大型图形而言过于昂贵，并且在查询时不可行。或者，为某些或所有关键字建立预先计算结果的索引会涉及非常昂贵的预处理。我们引入BinRank，该系统通过利用传统查询处理中受实例化视图启发的混合方法来近似ObjectRank结果。我们以这种方式具体化了数据图的一些相对较小的子集，从而可以通过仅在一个子图上运行ObjectRank来回答任何关键字查询。 BinRank通过根据语料库的共现性对语料库中的所有术语进行划分来生成子图，使用这些术语为每个分区执行ObjectRank以生成一组随机行走起点，并仅保留那些接收到不可忽略分数的对象。直觉是，包含与一组相关术语相关的所有对象和链接的子图应具有针对这些术语之一对对象进行排名所需的所有信息。我们证明BinRank可以在英语Wikipedia数据集上实现亚秒级的查询执行时间，同时产生与原始图上的ObjectRank结果非常接近的高质量搜索结果。 Wikipedia链接图包含大约10 ^ 8个边，比现有的基于动态权限的搜索系统能够展示的现有状态至少大两个数量级。我们的实验评估调查了查询执行时间，结果质量和BinRank的存储需求之间的权衡。

著录项

来源
《Data Engineering, ICDE, 2009 IEEE 25th International Conference on》|2009年|P.66-77|共12页
会议地点
作者
Hwang; Heasoo; Balmin; Andrey; Reinwald; Berthold; Nijkamp; Erik;
展开▼
作者单位

展开▼
会议组织
原文格式 PDF
正文语种
中图分类工业技术;
关键词

相似文献

外文文献
中文文献
专利

1. BinRank: Scaling Dynamic Authority-Based Search Using Materialized Subgraphs [J] . Hwang Heasoo, Balmin Andrey, Reinwald Berthold, Knowledge and Data Engineering, IEEE Transactions on . 2010,第8期

机译：BinRank：使用物化子图扩展基于动态权限的搜索
2. Generating Msg’s by Binrank for Scaling in Dynamic Authority Based Search [J] . International Journal of Computer Science and Technology . 2011,第3期

机译：Binrank生成消息以扩展基于动态权限的搜索
3. GPU acceleration of subgraph isomorphism search in large scale graph [J] . 杨博, 卢凯, 高颖慧, 中南大学学报（英文版） . 2015,第006期

机译：大规模图中子图同构搜索的GPU加速
4. BinRank: Scaling Dynamic Authority-Based Search Using Materialized SubGraphs [C] . Hwang Heasoo, Balmin Andrey, Reinwald Berthold, IEEE International Conference on Data Engineering . 2009

机译：Binrank：使用物化子图缩放基于动态权限的搜索
5. Dynamic scale genetic algorithm: An enhanced genetic search for discrete optimization. [D] . Joshi, Bela Dange. 1996

机译：动态规模遗传算法：用于离散优化的增强型遗传搜索。
6. Brain structure and dynamics across scales: In search of rules [O] . Xiao-Jing Wang, Henry Kennedy -1

机译：跨尺度的大脑结构和动力学：寻找规则
7. Universality regained: Kibble-Zurek dynamics, off-equilibrium scaling and the search for the QCD critical point [O] . Mukherjee, Swagato, Venugopalan, Raju, Yin, Yi 2017

机译：普遍性重新获得：Kibble-Zurek动力学，非均衡缩放并寻找QCD临界点
8. Comparison of Monte Carlo Tree Search and Rolling Horizon Optimization for Large Scale Dynamic Resource Allocation Problems. [R] . Griffith, J. D., Bertsimas, D., Gupta, V., 2015

机译：蒙特卡罗树搜索与滚动水平优化的大规模动态资源配置问题比较。

BinRank: Scaling Dynamic Authority-Based Search Using Materialized SubGraphs

摘要

著录项

相似文献

相关主题

期刊订阅