SHAD: The Scalable High-Performance Algorithms and Data-Structures Library



Abstract

The unprecedented amount of data that needs to be processed in emerging data analytics applications poses novel challenges to industry and academia. Scalability and high performance become more than desirable features because, given the scale and nature of the problems, they draw the line between what is achievable and what is infeasible. In this paper, we propose SHAD, the Scalable High-performance Algorithms and Data-structures library. SHAD adopts a modular design that confines low-level details and promotes reuse. SHAD's core is built on an Abstract Runtime Interface, which enhances portability and identifies the minimal set of features of the underlying system required by the framework. The core library includes common data structures such as Array, Vector, Map, and Set. These are designed to accommodate significant amounts of data that can be accessed in massively parallel environments, and to serve as building blocks for SHAD extensions, i.e., higher-level software libraries. We have validated and evaluated our design with a performance and scalability study of the core components of the library. We have validated the design's flexibility by proposing a Graph Library as an example of a SHAD extension, which implements two different graph data structures; we evaluate their performance with a set of graph applications. Experimental results show that the approach is promising in terms of both performance and scalability. On a distributed system with 320 cores, SHAD Arrays are able to sustain a throughput of 65 billion operations per second, while SHAD Maps sustain 1 billion operations per second. Algorithms implemented using the Graph Library exhibit performance and scalability comparable to a custom solution, but with less development effort.
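The layering described in the abstract (data structures that reach the underlying system only through a small runtime interface) can be illustrated with a toy sketch. This is not SHAD's API: SHAD is a C++ library, and every name below (`AbstractRuntime`, `ThreadRuntime`, `DistributedArray`, `execute_at`, `for_each_locality`, `insert_at`, `at`, `map_reduce`) is hypothetical. The sketch simulates "localities" with a thread pool and assumes a simple block partitioning of the array.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class AbstractRuntime(ABC):
    """Hypothetical minimal interface: the only system features the array needs."""

    @abstractmethod
    def num_localities(self) -> int:
        """Number of places (nodes, in a real system) that can hold data."""

    @abstractmethod
    def execute_at(self, locality, fn, *args):
        """Run fn(*args) on behalf of one locality and return its result."""

    @abstractmethod
    def for_each_locality(self, fn) -> list:
        """Run fn(locality) for every locality; results come back in locality order."""


class ThreadRuntime(AbstractRuntime):
    """Toy backend: localities are simulated by tasks on a shared thread pool."""

    def __init__(self, localities):
        self._n = localities
        self._pool = ThreadPoolExecutor(max_workers=localities)

    def num_localities(self):
        return self._n

    def execute_at(self, locality, fn, *args):
        # A real backend would ship the task to the target node.
        return self._pool.submit(fn, *args).result()

    def for_each_locality(self, fn):
        return list(self._pool.map(fn, range(self._n)))


class DistributedArray:
    """Block-partitioned array that touches the system only via AbstractRuntime."""

    def __init__(self, rt, size, fill=0):
        self._rt = rt
        self._size = size
        self._block = -(-size // rt.num_localities())  # ceiling division
        self._chunks = [
            [fill] * max(0, min(self._block, size - loc * self._block))
            for loc in range(rt.num_localities())
        ]

    def _locate(self, index):
        # Map a global index to (owning locality, offset inside its chunk).
        if not 0 <= index < self._size:
            raise IndexError(index)
        return divmod(index, self._block)

    def insert_at(self, index, value):
        loc, off = self._locate(index)
        self._rt.execute_at(loc, self._chunks[loc].__setitem__, off, value)

    def at(self, index):
        loc, off = self._locate(index)
        return self._rt.execute_at(loc, self._chunks[loc].__getitem__, off)

    def map_reduce(self, fn, combine, initial):
        # `initial` must be an identity element for `combine` (e.g. 0 for +).
        def local(loc):
            acc = initial
            for x in self._chunks[loc]:
                acc = combine(acc, fn(x))
            return acc

        result = initial
        for partial in self._rt.for_each_locality(local):
            result = combine(result, partial)
        return result


rt = ThreadRuntime(4)
arr = DistributedArray(rt, 10)
for i in range(10):
    arr.insert_at(i, i * i)
total = arr.map_reduce(lambda x: x, lambda a, b: a + b, 0)
```

Because `DistributedArray` depends only on the three methods of the interface, swapping `ThreadRuntime` for a backend that targets a real cluster would leave the data-structure code unchanged, which is the portability argument the abstract makes for its runtime layer.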


Bibliographic details

Source
IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2018, pp. 442-451 (10 pages)
Authors
Vito Giovanni Castellana; Marco Minutoli
Original format
PDF
Keywords
Runtime; Libraries; Scalability; Distributed databases; Software; Task analysis; Data structures

