
Demystifying the MLPerf Training Benchmark Suite

Abstract

MLPerf, an emerging machine learning benchmark suite, strives to cover a broad range of machine learning applications. We present a study on the characteristics of MLPerf benchmarks and how they differ from previous deep learning benchmarks such as DAWNBench and DeepBench. MLPerf benchmarks are seen to exhibit moderately high memory transactions per second and moderately high compute rates, whereas DAWNBench offers a high-compute benchmark with a low memory transaction rate, and DeepBench provides low-compute-rate benchmarks. We also observe that the various MLPerf benchmarks possess unique features that allow unveiling various bottlenecks in systems. We further observe variation in scaling efficiency across the MLPerf models; the variation exhibited by the different models highlights the importance of smart scheduling strategies for multi-GPU training. Another observation is that a dedicated low-latency interconnect between GPUs in multi-GPU systems is crucial for optimal distributed deep learning training. Furthermore, host CPU utilization increases with the number of GPUs used for training. Corroborating prior work, we also observe and quantify the improvements possible with mixed-precision training using Tensor Cores.
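The mixed-precision result mentioned above refers to running eligible operations in FP16 on Tensor Cores while keeping master weights and loss scaling in FP32. As an illustration only (not the paper's code), the sketch below shows this pattern with PyTorch's automatic mixed precision (torch.cuda.amp); the model, data, and hyperparameters are hypothetical.

```python
import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

# Toy model and optimizer; mixed precision itself is handled at the framework level.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
scaler = GradScaler()  # scales the loss to avoid FP16 gradient underflow

def train_step(inputs, targets):
    optimizer.zero_grad()
    # autocast runs eligible ops (e.g. matmuls, convolutions) in FP16,
    # which lets them map onto Tensor Cores on supporting GPUs.
    with autocast():
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)
    scaler.scale(loss).backward()   # backward pass on the scaled loss
    scaler.step(optimizer)          # unscales gradients, then applies the optimizer step
    scaler.update()                 # adjusts the loss scale for the next iteration
    return loss.item()

# Hypothetical usage with random data
inputs = torch.randn(64, 1024, device="cuda")
targets = torch.randint(0, 10, (64,), device="cuda")
print(train_step(inputs, targets))
```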
