Journal: IEEE Transactions on Parallel and Distributed Systems

Efficient Parallel Skyline Evaluation Using MapReduce



Abstract

This research develops a two-phase MapReduce solution that efficiently answers skyline queries on large datasets. Unlike existing parallel skyline approaches, our scheme treats data partitioning, filtering, and parallel skyline evaluation as a holistic query process. In the first phase, we apply filtering techniques and angle-based partitioning: unqualified objects are discarded, and the remaining objects are partitioned by their angles to the origin. In the second phase, local skyline objects in each partition are calculated in parallel, and global skyline objects are output after a skyline merging process. To improve the parallel local skyline calculation, we propose two partition-aware filtering methods that keep skyline candidates balanced across partitions. Aggressive partition-aware filtering eliminates objects in the partition with the greatest population of candidate objects, whereas proportional partition-aware filtering slows the growth of partition population proportionally. Recognizing the lack of studies that incorporate the MapReduce framework into parallel skyline processing, we propose a partial-presort grid-based partition skyline algorithm that significantly improves the merging skyline computation on large datasets. The presort can be completed in the shuffle phase with little overhead. Our experimental results show the efficiency and effectiveness of the proposed MapReduce-based parallel skyline solution on large-scale datasets.
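To make the two-phase idea concrete, the following is a minimal single-process sketch, not the paper's implementation. It assumes 2-D points with non-negative coordinates where smaller values are better, uses a naive nested-loop skyline, and omits the filtering, partition-aware balancing, and grid-based presort steps described above. All function names (`dominates`, `skyline`, `angle_partition`, `parallel_skyline`) are hypothetical.

```python
import math
from collections import defaultdict


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere
    (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Naive skyline: keep every point that no other point dominates."""
    result = []
    for p in points:
        if any(dominates(q, p) for q in result):
            continue
        # p survives; drop anything it dominates before adding it.
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


def angle_partition(point, num_partitions):
    """Assign a 2-D point to a partition by its angle to the origin."""
    x, y = point
    angle = math.atan2(y, x)  # lies in [0, pi/2] for non-negative coordinates
    idx = int(angle / (math.pi / 2) * num_partitions)
    return min(idx, num_partitions - 1)  # angle == pi/2 falls in the last partition


def parallel_skyline(points, num_partitions=4):
    # Phase 1 (map side): group points into angular partitions.
    partitions = defaultdict(list)
    for p in points:
        partitions[angle_partition(p, num_partitions)].append(p)
    # Phase 2 (reduce side): local skyline per partition, then a merge step.
    candidates = [p for part in partitions.values() for p in skyline(part)]
    return skyline(candidates)


points = [(1, 9), (2, 7), (3, 8), (4, 4), (5, 5), (7, 2), (8, 3), (9, 1), (6, 6)]
result = sorted(parallel_skyline(points))
```

Because every global skyline point is also a skyline point of its own partition, merging the local skylines yields the same result as computing the skyline over the whole dataset; in a real MapReduce job the per-partition work would run in separate reducers.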
