Towards Selecting Best Combination of SQL-on-Hadoop Systems and JVMs


Abstract

Hadoop is the de facto standard big-data middleware, and many frameworks have been developed on top of it. Because many SQL-on-Hadoop systems are available, we often need to decide which engine is best for our queries. We can choose not only among query engines but also among Java virtual machines (JVMs). As these systems become more complex, however, no single system performs best in every case, and the performance of a mismatched system may degrade greatly. To obtain the best performance, it is important to know what types of queries suit each system and then to schedule each query on the appropriate system. In this paper, we evaluated the TPC-DS benchmark on combinations of query engines (Spark and Tez) and JVMs (J9 and OpenJDK). We found that choosing a different engine can lead to a slowdown of more than 10 times and that choosing a different JVM can lead to a slowdown of 3 times. We also analyzed the characteristics of each combination and then proposed classification models that select the best combination of systems from a generated query plan. As a result, we achieved a total performance improvement of up to two times with the classifier.


Bibliographic details

Source
IEEE International Conference on Cloud Computing | 2018 | pp. 245-252 | 8 pages
Authors
Tatsuhiro Chiba; Takeshi Yoshimura; Michihiro Horie; Hiroshi Horii
Original format: PDF
Keywords
Sparks; Engines; Java; Biological system modeling; Runtime; Query processing; Structured Query Language


Similar literature


1. Tailor-made JVMs for statically configured embedded systems [J]. Michael Stilkerich, Isabella Thomm, Christian Wawersich. Concurrency and Computation: Practice and Experience, 2012, no. 8a10.
2. A JVM for Soft-Error-Prone Embedded Systems [J]. Isabella Stilkerich, Michael Strotz, Christoph Erhardt. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages, 2013, no. 5.
3. Multi-Layer Real-Time Support for JVM-based Smart Phone Systems [J]. Woo Y., Lim D., Jung Y. Advances in Electrical and Computer Engineering, 2013, no. 3.
4. Towards Selecting Best Combination of SQL-on-Hadoop Systems and JVMs [C]. Tatsuhiro Chiba, Takeshi Yoshimura, Michihiro Horie. IEEE International Conference on Cloud Computing, 2018.
5. Don't Get Caught in the Cold, Warm-Up Your JVM: Understand and Eliminate JVM Warm-Up Overhead in Data-Parallel Systems [D]. Lion, David. 2017.
6. A combination of selected mapping and clipping to increase energy efficiency of OFDM systems [O]. Byung Moo Lee, You Seung Rim, Wonjong Noh.
7. Trace fragment selection within method-based JVMs [O]. Duane Merrill, Kim Hazelwood. 2008.
8. A Simulation Analysis for Ranking and Selecting the Best Combination of Production Planning and Accounting Control Systems [R]. Lin, W. T., Chen, H. J., Dudewicz, E. J. 1982.
