TPCx-HS v2: Transforming with Technology Changes

Abstract

The TPCx-HS Hadoop benchmark has helped drive competition in the Big Data marketplace and has proven to be a successful industry standard benchmark for Hadoop systems. However, the Big Data landscape has changed rapidly since its initial release in 2014. Key technologies have matured, while new ones have risen to prominence in an effort to keep pace with the exponential expansion of datasets. For example, Hadoop has undergone a much-needed upgrade to the way that scheduling, resource management, and execution occur, while Apache Spark has risen to be the de facto standard for in-memory cluster compute for ETL, Machine Learning, and Data Science workloads. Moreover, enterprises are increasingly considering cloud infrastructure for Big Data processing. What has not changed since TPCx-HS was first released is the need for a straightforward, industry standard way in which these current technologies and architectures can be evaluated. In this paper, we introduce TPCx-HS v2, which is designed to address these changes in the Big Data technology landscape and to stress both the hardware and software stacks, including the execution engine (MapReduce or Spark) and Hadoop Filesystem API compatible layers, for both on-premises and cloud deployments.