首页> 外文会议>Data Engineering (ICDE), 2010 >K nearest neighbor queries and kNN-Joins in large relational databases (almost) for free

【24h】

K nearest neighbor queries and kNN-Joins in large relational databases (almost) for free

机译：免费（大型）关系数据库中的K个最近邻居查询和kNN-Joins

获取原文

页面导航

摘要
著录项
相似文献
相关主题

摘要

Finding the k nearest neighbors (kNN) of a query point, or a set of query points (kNN-Join) are fundamental problems in many application domains. Many previous efforts to solve these problems focused on spatial databases or stand-alone systems, where changes to the database engine may be required, which may limit their application on large data sets that are stored in a relational database management system. Furthermore, these methods may not automatically optimize kNN queries or kNN-Joins when additional query conditions are specified. In this work, we study both the kNN query and the kNN-Join in a relational database, possibly augmented with additional query conditions. We search for relational algorithms that require no changes to the database engine. The straightforward solution uses the user-defined-function (UDF) that a query optimizer cannot optimize.We design algorithms that could be implemented by SQL operators without changes to the database engine, hence enabling the query optimizer to understand and generate the ÃÂ¿bestÃÂ¿ query plan. Using only a small constant number of random shifts for databases in any fixed dimension, our approach guarantees to find the approximate kNN with only logarithmic number of page accesses in expectation with a constant approximation ratio and it could be extended to find the exact kNN efficiently in any fixed dimension. Our design paradigm easily supports the kNN-Join and updates. Extensive experiments on large, real and synthetic, data sets confirm the efficiency and practicality of our approach.

机译：查找查询点的k个最近邻居（kNN）或一组查询点（kNN-Join）是许多应用程序领域中的基本问题。解决这些问题的许多以前的努力都集中在空间数据库或独立系统上，其中可能需要更改数据库引擎，这可能会限制它们在关系数据库管理系统中存储的大型数据集上的应用。此外，当指定其他查询条件时，这些方法可能不会自动优化kNN查询或kNN-Joins。在这项工作中，我们在关系数据库中研究了kNN查询和kNN-Join，可能还会增加其他查询条件。我们搜索不需要更改数据库引擎的关系算法。直截了当的解决方案使用查询优化器无法优化的用户定义函数（UDF）。我们设计了可以由SQL运算符实现的算法，而无需更改数据库引擎，从而使查询优化器能够理解并生成Â最佳查询计划。对于任何固定维度的数据库，仅使用少量恒定的数据库随机移位，我们的方法就可以保证仅以对数的访问量以期望的近似比率找到仅具有页面访问次数的对数的近似kNN，并且可以扩展为有效地找到精确的kNN。任何固定尺寸。我们的设计范例轻松支持kNN-Join和更新。在大型，真实和综合数据集上进行的大量实验证实了我们方法的有效性和实用性。

著录项

来源
《Data Engineering (ICDE), 2010》|2010年|p.4-15|共12页
会议地点
作者
Bin Yao; Feifei Li; Kumar P.;
展开▼
作者单位

展开▼
会议组织
原文格式 PDF
正文语种
中图分类数据处理、数据处理系统;
关键词

相似文献

外文文献
中文文献
专利

1. Hybrid KNN-join: Parallel nearest neighbor searches exploiting CPU and GPU architectural features [J] . Michael Gowanlock Journal of Parallel and Distributed Computing . 2021,第Mara期

机译：Hybrid Knn-Join：并行最近的邻权搜索CPU和GPU架构功能
2. A crowd enabled approach for processing nearest neighbor and range queries in incomplete databases with accuracy guarantee [J] . Mahin Mehnaz Tabassum, Hashem Tanzima, Kabir Samia Pervasive and Mobile Computing . 2017,第期

机译：通过精度保证，在不完整的数据库中处理最近的邻居和范围查询的人群化方法
3. Top-n query processing in spatial databases considering bi-chromatic reverse fc-nearest neighbors [J] . Cha-Lun Li, En Tzu Wang, Guo-Jhu Huang, Information Systems . 2014,第juna期

机译：考虑双色反向fc最近邻的空间数据库中的top-n查询处理
4. K nearest neighbor queries and kNN-Joins in large relational databases (almost) for free [C] . Bin Yao, Feifei Li, Kumar P. Data Engineering (ICDE), 2010 . 2010

机译：免费（大型）关系数据库中的K个最近邻居查询和kNN-Joins
5. Efficient processing of k-nearest neighbor queries over relational databases: A cost-based optimization [D] . Ayanso, Anteneh W. 2004

机译：在关系数据库上高效处理k最近邻查询：基于成本的优化
6. Erratum: Wang Y. et al. A Privacy Preserving Scheme for Nearest Neighbor Query. Sensors 2018 18 2440 [O] . Yuhang Wang, Zhihong Tian, Hongli Zhang, 2019

机译：勘误：WangY.等人最近邻居查询的隐私保护方案。传感器2018182440
7. K Nearest Neighbor Queries and KNN-Joins in Large Relational Databases (Almost) for Free [O] . Bin Yao, Feifei Li, Piyush Kumar 2010

机译：大型关系数据库中的K个最近邻居查询和KNN联接（几乎）免费

K nearest neighbor queries and kNN-Joins in large relational databases (almost) for free

摘要

著录项

相似文献

相关主题

期刊订阅