Operating Systems Review
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark


Abstract

Researchers have proposed hardware, software, and algorithmic optimizations to improve the computational performance of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision) and can impact the final model's accuracy on unseen data. Due to a lack of standard evaluation criteria that consider these trade-offs, it is difficult to compare these optimizations directly. To address this problem, we recently introduced DAWNBENCH, a benchmark competition focused on end-to-end training time to achieve near-state-of-the-art accuracy on an unseen dataset, a combined metric called time-to-accuracy (TTA). In this work, we analyze the entries from DAWNBENCH, which received optimized submissions from multiple industrial groups, to investigate the behavior of TTA as a metric as well as trends in the best-performing entries. We show that TTA has a low coefficient of variation and that models optimized for TTA generalize nearly as well as those trained using standard methods. Additionally, even though DAWNBENCH entries were able to train ImageNet models in under 3 minutes, we find they still underutilize hardware capabilities such as Tensor Cores. Furthermore, we find that distributed entries can spend more than half of their time on communication. We show similar findings with entries to the MLPerf v0.5 benchmark.
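The time-to-accuracy metric and the run-to-run coefficient of variation discussed in the abstract can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the helper names, the 93% accuracy target, and the accuracy logs are all made up for the example.

```python
import statistics

def time_to_accuracy(log, target_acc):
    """Return the first elapsed wall-clock time (seconds) at which
    validation accuracy reaches the target, or None if it never does.
    `log` is a list of (elapsed_seconds, val_accuracy) pairs."""
    for elapsed, acc in log:
        if acc >= target_acc:
            return elapsed
    return None

def coefficient_of_variation(samples):
    """CV = sample stdev / mean; a low CV means the metric is
    stable across repeated runs of the same entry."""
    return statistics.stdev(samples) / statistics.mean(samples)

# Hypothetical validation logs from three repeated runs of one entry.
runs = [
    [(60, 0.880), (120, 0.925), (180, 0.940)],
    [(60, 0.870), (125, 0.930)],
    [(60, 0.890), (118, 0.935)],
]
ttas = [time_to_accuracy(run, target_acc=0.93) for run in runs]
cv = coefficient_of_variation(ttas)
print(ttas)  # TTA in seconds for each run
print(cv)    # run-to-run variability of the TTA metric
```

Because TTA is a threshold-crossing time rather than a throughput number, it rewards optimizations only insofar as they still reach the target accuracy, which is the trade-off the benchmark is designed to capture.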
