2012 International Green Computing Conference

Leveraging thermal dynamics in sensor placement for overheating server component detection

Abstract

Server overheating has become a well-known issue in today's data centers, which host large numbers of high-density servers. The current practice for detecting server overheating is to monitor either the server inlet temperature, using the temperature sensor on the server enclosure, or the CPU temperature, using on-die thermal sensors. However, this practice overlooks the fact that different components in a server may have different overheating thresholds, which are closely related to their respective thermal failure rates and expected lifetimes. Moreover, the thermal correlation between the inlet (or CPU) and other server components can differ for every server model. As a result, relying on a single inlet or CPU temperature for overheating detection is overly simplistic. It may lead either to degraded detection performance or to false alarms, which can trigger excessive cooling power and an unnecessarily low inlet temperature. In this paper, we propose a model-based approach that leverages thermal dynamics to intelligently choose sensor placement locations for precise detection of overheating server components. We first formulate the detection problem as a constrained optimization problem. We then adopt Computational Fluid Dynamics (CFD) to establish the thermal model and analyze the thermal status of the server enclosure under various overheating conditions, such as inlet overheating, fan failures, and CPU overloading. Based on the CFD analysis, we apply data fusion and advanced optimization techniques to find a near-optimal set of sensor placement locations, such that the probability of detecting different overheating components is significantly improved. Our empirical results on a real rack server testbed demonstrate the detection performance of our solution. Extensive simulation results also show that the proposed solution outperforms other commonly used overheating monitoring solutions in terms of detection probability and error rate.
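To make the idea of choosing sensor locations under a budget more concrete, here is a minimal sketch. It is not the method of the paper: the abstract does not give the actual formulation, so a simple greedy maximum-coverage heuristic is swapped in purely for illustration. The table `rise`, the threshold value, and the function name `greedy_sensor_placement` are all hypothetical; in practice the temperature rises would come from a thermal model such as the CFD analysis mentioned above.

```python
# Hypothetical data: rise[s][c] is the temperature rise (in deg C) at candidate
# sensor location c under overheating scenario s. All values are made up.
rise = [
    [6.0, 1.0, 0.5],  # scenario 0: inlet overheating
    [5.5, 0.8, 4.2],  # scenario 1: fan failure
    [0.7, 7.0, 1.0],  # scenario 2: CPU overloading
    [0.2, 0.4, 3.5],  # scenario 3: a second fan failure
]


def greedy_sensor_placement(rise, threshold, max_sensors):
    """Pick up to max_sensors locations with a greedy max-coverage heuristic.

    A location "detects" a scenario if its temperature rise reaches the
    threshold. Returns (chosen locations, set of detected scenarios).
    """
    n_scenarios = len(rise)
    n_candidates = len(rise[0]) if rise else 0

    # detects[c] = scenarios that a sensor at location c would detect
    detects = [
        {s for s in range(n_scenarios) if rise[s][c] >= threshold}
        for c in range(n_candidates)
    ]

    chosen, covered = [], set()
    while len(chosen) < max_sensors:
        # Location that adds the most scenarios not yet covered
        best = max(
            (c for c in range(n_candidates) if c not in chosen),
            key=lambda c: len(detects[c] - covered),
            default=None,
        )
        # Stop when no location is left or nothing new would be detected
        if best is None or not (detects[best] - covered):
            break
        chosen.append(best)
        covered |= detects[best]
    return chosen, covered


locations, detected = greedy_sensor_placement(rise, threshold=3.0, max_sensors=2)
print(locations, sorted(detected))
```

With a budget of two sensors, this toy example leaves one scenario undetected; raising `max_sensors` to 3 covers all four. A real formulation would also need to account for per-component thresholds and false-alarm rates, which this sketch ignores.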
