An R-Tree-Based k-Nearest-Neighbor Join Algorithm in the MapReduce Framework

Liu Yi (刘义); Jing Ning (景宁); Chen Luo (陈荦); Xiong Wei (熊伟)


Abstract

To support high-performance k-nearest-neighbor join (knnJ) queries over large-scale spatial data, this paper studies knnJ query processing based on an R-tree index in the MapReduce framework. First, the MapReduce parallel programming model is abstracted through a formal definition of independent parallelism and sequential synchronization (IPSS) computation. Building on this abstraction, the paper proposes two algorithms: one for fast bulk construction of an R-tree index, and one for parallel knnJ processing based on the constructed R-tree. During index construction, a sampling algorithm quickly determines the spatial partition function, so that construction conforms to the IPSS abstraction and is easy to express in MapReduce. During query processing, a kNN expanded bounding box is introduced on top of the distributed R-tree index to limit the query range and partition the data; the R-tree index is then used to execute the knnJ query in parallel, which improves query efficiency. The communication and computation costs of the proposed algorithms are analyzed theoretically. Experiments on real datasets show that the algorithm is efficient and scalable, supports knnJ queries over large-scale spatial data well, and is of practical value.
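The pipeline described in the abstract (sample-based partitioning, an expanded bounding box that limits the candidate set, then a local join per partition) can be illustrated with a small single-process sketch. This is not the paper's algorithm: all function names are hypothetical, the partitioning uses simple vertical strips, the local join is a brute-force scan standing in for the R-tree lookup, and the expansion radius is one conservative bound chosen for this illustration, since the abstract does not define how the paper computes its expanded box. The sketch assumes 2-D points given as distinct tuples and `len(S) >= k`.

```python
import bisect
import heapq
import math
import random
from collections import defaultdict


def knn_join_bruteforce(R, S, k):
    """Reference kNN join: map each r in R to its k closest points of S."""
    return {r: heapq.nsmallest(k, S, key=lambda s: math.dist(r, s)) for r in R}


def sample_split_points(R, num_partitions, sample_size, seed=0):
    """Derive x-coordinate split points from a random sample of R (assumed scheme)."""
    rng = random.Random(seed)
    xs = sorted(p[0] for p in rng.sample(R, min(sample_size, len(R))))
    return [xs[i * len(xs) // num_partitions] for i in range(1, num_partitions)]


def map_phase(R, splits):
    """'Map' step: route each r to a strip; every record is handled independently."""
    groups = defaultdict(list)
    for r in R:
        groups[bisect.bisect_right(splits, r[0])].append(r)
    return groups


def expanded_box(part, S, k):
    """Bounding box of a partition, grown by a radius rho that is safe for kNN.

    rho is the largest distance from any box corner to any of k anchor points
    of S. Distance to a fixed point is convex, so over the box it peaks at a
    corner; hence every r in the box has at least k points of S within rho.
    """
    xs, ys = zip(*part)
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    center = ((lo_x + hi_x) / 2, (lo_y + hi_y) / 2)
    anchors = heapq.nsmallest(k, S, key=lambda s: math.dist(center, s))
    corners = [(x, y) for x in (lo_x, hi_x) for y in (lo_y, hi_y)]
    rho = max(math.dist(c, s) for c in corners for s in anchors)
    return lo_x - rho, hi_x + rho, lo_y - rho, hi_y + rho


def reduce_phase(part, S, k):
    """'Reduce' step: join one partition against the S points in its expanded box."""
    lo_x, hi_x, lo_y, hi_y = expanded_box(part, S, k)
    candidates = [s for s in S if lo_x <= s[0] <= hi_x and lo_y <= s[1] <= hi_y]
    # Brute-force scan here; the paper uses an R-tree index for this lookup.
    return knn_join_bruteforce(part, candidates, k)


def knn_join_partitioned(R, S, k, num_partitions=4, sample_size=50):
    """Run the map and reduce steps sequentially to mimic one MapReduce round."""
    splits = sample_split_points(R, num_partitions, sample_size)
    result = {}
    for part in map_phase(R, splits).values():
        result.update(reduce_phase(part, S, k))
    return result
```

Because each partition only needs the points of `S` inside its own expanded box, the partitions can be joined independently, which is what makes the computation fit a map step followed by a reduce step. The partitioned result should match `knn_join_bruteforce(R, S, k)` up to ties between equidistant points.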

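Bibliographic details

Source
Journal of Software (《软件学报》) | 2013, No. 8 | pp. 1836-1851 | 16 pages
Authors
Liu Yi (刘义); Jing Ning (景宁); Chen Luo (陈荦); Xiong Wei (熊伟)
Affiliation
College of Electronic Science and Engineering, National University of Defense Technology (国防科学技术大学电子科学与工程学院), Changsha, Hunan 410073 (all four authors)
Original format: PDF
Language: Chinese
CLC classification: programming; software engineering
Keywords
cloud computing; MapReduce; k-nearest-neighbor join; spatial query; R-tree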

