Conference: Euromicro International Conference on Parallel, Distributed and Network-Based Processing

IO Challenges for Human Brain Atlasing Using Deep Learning Methods - An In-Depth Analysis

Abstract

The use of Deep Learning methods has been identified as a key opportunity for enabling the processing of extreme-scale scientific datasets. Feeding data into compute nodes equipped with several high-end GPUs at a sufficiently high rate is a known challenge. Facilitating the processing of these datasets thus requires the ability to store petabytes of data as well as to access the data with very high bandwidth. In this work, we look at two Deep Learning use cases for cytoarchitectonic brain mapping. These applications are very challenging for the underlying IO system. We present an in-depth analysis of their IO requirements and performance. Both applications are limited by IO performance, as the training processes often have to wait several seconds for new training data. Both applications read random patches from a collection of large HDF5 datasets or TIFF files, which results in many small non-consecutive accesses to the parallel file system. By using a chunked data format or storing temporary copies of the required patches, the IO performance can be improved significantly. This leads to a decrease in total runtime of up to 80%.
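
The access-pattern argument in the abstract can be illustrated with a small back-of-the-envelope sketch. It is not taken from the paper: the function names and the image, patch, and chunk sizes are hypothetical, and it only counts read requests for a single 2D patch under two idealized on-disk layouts (contiguous row-major versus fixed-size chunks).

```python
def contiguous_reads_row_major(patch_h, patch_w, image_w):
    """Count contiguous byte runs needed to read one patch from a
    row-major (unchunked) 2D array. Each patch row is a separate run
    unless the patch spans the full image width."""
    if patch_w == image_w:
        return 1
    return patch_h


def chunks_touched(y, x, patch_h, patch_w, chunk_h, chunk_w):
    """Count the chunks that overlap a patch whose top-left corner is
    at (y, x) when the array is stored in chunk_h x chunk_w tiles."""
    rows = (y + patch_h - 1) // chunk_h - y // chunk_h + 1
    cols = (x + patch_w - 1) // chunk_w - x // chunk_w + 1
    return rows * cols


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    image_w = 100_000          # image width in pixels
    patch = 512                # square patch edge length
    chunk = 256                # square chunk edge length
    y, x = 1000, 3000          # unaligned patch position

    print(contiguous_reads_row_major(patch, patch, image_w))  # 512
    print(chunks_touched(y, x, patch, patch, chunk, chunk))    # 9
```

Under these assumed sizes, the unchunked layout needs 512 small non-consecutive reads for one patch, while the chunked layout needs 9 larger ones, at the cost of reading about 2.25 times as many pixels as the patch contains. Real file systems and HDF5 caching behave in more complex ways, so this is only a rough model of why chunking reduces the number of requests.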