Venue: International Workshop on Embedded Multicore Systems

An Efficient Filter Strategy for Theta-Join Query in Distributed Environment

Abstract

Theta-join queries are common in traditional databases, but in a distributed environment their computation and communication costs are so high that they cannot be processed efficiently on big data. Current research focuses on processing theta-joins with the MapReduce framework and mainly considers the overhead of load balancing in the network; as the data sets grow, the massive intermediate results lead to high communication cost. In this work, we propose a filter method for theta-joins that reduces the computation and communication cost in a distributed environment and thereby improves theta-join query processing. We consider both the load balance in the cluster and the memory cost in the parallel framework. We have implemented our method in Spark, a popular general-purpose data processing framework. The experimental results demonstrate that our method significantly improves the performance of theta-joins compared with state-of-the-art solutions.
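
The abstract does not spell out the filter itself, so the following is only a generic illustration of why filtering before the join can cut work, not the authors' method. It assumes a single inequality condition `r < s`, splits both inputs into sorted chunks (standing in for partitions), and skips any chunk pair whose value ranges cannot produce a match. The names `chunk`, `theta_join_less_than`, and `chunk_size` are hypothetical, and the code is plain Python rather than Spark.

```python
from itertools import product


def chunk(rows, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def theta_join_less_than(r_values, s_values, chunk_size=2):
    """Join on the condition r < s, pruning chunk pairs that cannot match.

    Assumption: sorting before chunking keeps each chunk's value range
    narrow, which makes the min/max filter more selective.
    Returns the matching pairs and the number of chunk pairs compared.
    """
    r_chunks = chunk(sorted(r_values), chunk_size)
    s_chunks = chunk(sorted(s_values), chunk_size)

    result = []
    compared = 0
    for r_chunk, s_chunk in product(r_chunks, s_chunks):
        # Filter step: if even the smallest r is not below the largest s,
        # no pair from these two chunks can satisfy r < s.
        if min(r_chunk) >= max(s_chunk):
            continue
        compared += 1
        # Only surviving chunk pairs pay for the pairwise comparison.
        result.extend((r, s) for r in r_chunk for s in s_chunk if r < s)
    return result, compared


pairs, compared = theta_join_less_than([5, 1, 9, 7], [2, 8, 3, 6])
print(len(pairs), "matching pairs from", compared, "compared chunk pairs")
```

In a distributed setting, a pruned chunk pair would correspond to a pair of partitions that never needs to be shipped or compared, which is the kind of saving in intermediate results that the abstract refers to.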