Conference paper · IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems

CFD Simulation and Optimization of the Cooling of Open Compute Machine Learning “Big Sur” Server



Abstract

In recent years, Artificial Intelligence and Machine Learning have grown at a phenomenal rate, driven by the collection and mining of large data sets used to train computers for tasks such as image and speech recognition. Machine Learning workloads require substantial computing power to carry out their numerous calculations, so most servers built for them are powered by Graphics Processing Units (GPUs) rather than traditional CPUs; GPUs provide more computational throughput per dollar spent than traditional CPUs. The Open Compute Project recently introduced the state-of-the-art machine learning server "Big Sur". The Big Sur unit consists of a 4OU (OpenU) chassis housing eight NVIDIA Tesla M40 GPUs and two CPUs, along with SSD storage and hot-swappable fans at the rear. Airflow management is a critical requirement when implementing air cooling for rack-mount servers, to ensure that all components, especially critical devices such as CPUs and GPUs, receive adequate flow. In addition, component placement within the chassis plays a vital role in the airflow path and affects the overall system resistance. In this paper, a sizeable improvement in chassis ducting is targeted to counteract the effects of air diffusion at the rear of the airflow duct in the "Big Sur" Open Compute machine learning server, in which the GPUs are located directly downstream of the CPUs. A CFD simulation of the detailed server model is performed with the objective of understanding the effect of airflow bypass on GPU die temperatures and fan power consumption. The cumulative effect was studied through simulations to quantify the resulting improvement in the server's fan power consumption. The accompanying reduction in acoustic noise levels from the server fans is also discussed.
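The abstract's core trade-off (better ducting lowers GPU die temperature, which lets fans spin slower and save power) can be sketched with two standard relations: a thermal-resistance model for die temperature and the fan affinity law for power. This is an illustrative sketch only, not the paper's method; all function names and numeric values below are hypothetical placeholders, not measured data from the Big Sur server.

```python
# Illustrative sketch (not from the paper): how improved airflow to the GPUs
# maps to die temperature and fan power. All numbers are hypothetical.

def gpu_die_temp(t_inlet_c: float, power_w: float, theta_ca_c_per_w: float) -> float:
    """Die temperature from inlet air temperature, GPU power, and the
    case-to-air thermal resistance (theta drops as local airflow improves)."""
    return t_inlet_c + power_w * theta_ca_c_per_w

def fan_power_scaled(p_ref_w: float, rpm_ref: float, rpm_new: float) -> float:
    """Fan affinity law: power scales with the cube of fan speed."""
    return p_ref_w * (rpm_new / rpm_ref) ** 3

# Hypothetical example: improved ducting lets the fans run at 80% speed
# while holding the same die temperature.
t_die = gpu_die_temp(t_inlet_c=35.0, power_w=250.0, theta_ca_c_per_w=0.16)
p_fan = fan_power_scaled(p_ref_w=36.0, rpm_ref=10000, rpm_new=8000)
print(round(t_die, 1))  # 75.0
print(round(p_fan, 2))  # 18.43
```

The cubic dependence of fan power on speed is why even modest airflow gains from ducting changes can yield outsized fan-power and acoustic-noise reductions.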


