Source: Proceedings of the IASTED international conferences on informatics

COMPARISON OF TABLE JOIN EXECUTION TIME FOR PARALLEL DBMS AND MAPREDUCE


Abstract

An analysis of existing research indicates that parallel DBMSs are preferred for implementing queries on structured data, while MapReduce (MR) is perceived as a supplement to DBMS technology. We attempt to characterize the behavior of a parallel row-store DBMS and the MR system Hadoop on a join task as we vary parameters that other authors' experiments either hold fixed or set differently from ours. This article presents detailed process models for table joins in the parallel row-store DBMS and in the MR system, together with the results of detailed computational experiments performed on these models. The models were set up for various scalability schemes for MR (number of nodes) and for the DBMS (data volume per node), with the joined tables fragmented by primary key. The following parameters were varied: selectivity of the queried data, number of sorted result records, and cardinality of the grouping attribute. The modeling results show that, as the stored data volume grows, the parallel DBMS begins to lose to the MR system at certain thresholds.
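
The abstract does not reproduce the paper's process models. As a rough illustration only of the kind of table join an MR system carries out, the following minimal sketch simulates a repartition (reduce-side) join in a single Python process. Every name in it (`map_phase`, `shuffle`, `reduce_phase`, `mr_join`, the `num_reducers` parameter, and the sample tables) is hypothetical and is not taken from the paper or from Hadoop's API.

```python
from collections import defaultdict
from itertools import product


def map_phase(table_name, rows, key_field):
    """Emit (join_key, (table_name, row)) pairs, as a mapper would."""
    for row in rows:
        yield row[key_field], (table_name, row)


def shuffle(pairs, num_reducers):
    """Hash-partition the pairs by join key, one group dict per simulated reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[hash(key) % num_reducers][key].append(value)
    return partitions


def reduce_phase(partition, left_name, right_name):
    """For each key, pair every left-table row with every right-table row."""
    for values in partition.values():
        left = [row for name, row in values if name == left_name]
        right = [row for name, row in values if name == right_name]
        for l_row, r_row in product(left, right):
            yield {**l_row, **r_row}


def mr_join(left_rows, right_rows, key_field, num_reducers=3):
    """Inner equi-join of two lists of dicts on key_field (illustrative only)."""
    pairs = list(map_phase("L", left_rows, key_field))
    pairs += list(map_phase("R", right_rows, key_field))
    joined = []
    for partition in shuffle(pairs, num_reducers):
        joined.extend(reduce_phase(partition, "L", "R"))
    return joined


if __name__ == "__main__":
    # Hypothetical sample tables sharing the key field "id".
    customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    orders = [{"id": 1, "total": 10}, {"id": 1, "total": 5}, {"id": 3, "total": 7}]
    for record in mr_join(customers, orders, "id"):
        print(record)
```

In this sketch, every record of both tables passes through the shuffle step. A parallel DBMS whose tables are already fragmented by the join key can, in principle, join matching fragments locally on each node; that difference is the kind of cost trade-off the abstract's models compare, though the sketch itself makes no performance claim.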
