Conference: IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

SHAD: The Scalable High-Performance Algorithms and Data-Structures Library



Abstract

The unprecedented amount of data that needs to be processed in emerging data analytics applications poses novel challenges to industry and academia. Scalability and high performance become more than desirable features because, due to the scale and the nature of the problems, they draw the line between what is achievable and what is unfeasible. In this paper, we propose SHAD, the Scalable High-performance Algorithms and Data-structures library. SHAD adopts a modular design that confines low-level details and promotes reuse. SHAD's core is built on an Abstract Runtime Interface, which enhances portability and identifies the minimal set of features of the underlying system required by the framework. The core library includes common data-structures such as Array, Vector, Map, and Set. These are designed to accommodate significant amounts of data that can be accessed in massively parallel environments, and to be used as building blocks for SHAD extensions, i.e., higher-level software libraries. We have validated and evaluated our design with a performance and scalability study of the core components of the library. We have validated the design flexibility by proposing a Graph Library as an example of a SHAD extension, which implements two different graph data-structures; we evaluate their performance with a set of graph applications. Experimental results show that the approach is promising in terms of both performance and scalability. On a distributed system with 320 cores, SHAD Arrays are able to sustain a throughput of 65 billion operations per second, while SHAD Maps sustain 1 billion operations per second. Algorithms implemented using the Graph Library exhibit performance and scalability comparable to a custom solution, but with less development effort.
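To put the reported throughput figures in perspective, the short sketch below converts them into average per-core rates. It uses only the three numbers quoted in the abstract (320 cores, 65 billion and 1 billion operations per second). The helper name `per_core_throughput` is hypothetical, and the even split across cores is an assumption made for illustration; the abstract does not say how the work was distributed.

```python
def per_core_throughput(total_ops_per_sec: float, cores: int) -> float:
    """Average operations per second per core.

    Assumes (for illustration only) that every core contributes equally.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_ops_per_sec / cores


# Figures quoted in the abstract.
CORES = 320
ARRAY_OPS_PER_SEC = 65e9  # SHAD Arrays
MAP_OPS_PER_SEC = 1e9     # SHAD Maps

array_per_core = per_core_throughput(ARRAY_OPS_PER_SEC, CORES)
map_per_core = per_core_throughput(MAP_OPS_PER_SEC, CORES)

print(f"Array: {array_per_core:,.0f} ops/s per core")  # Array: 203,125,000 ops/s per core
print(f"Map:   {map_per_core:,.0f} ops/s per core")    # Map:   3,125,000 ops/s per core
print(f"Array/Map ratio: {ARRAY_OPS_PER_SEC / MAP_OPS_PER_SEC:.0f}x")  # Array/Map ratio: 65x
```

Under that assumption, Arrays average roughly 203 million operations per second per core and Maps roughly 3.1 million, a 65-fold gap between the two data-structures.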
