Temperature based fault forecasting in computer clusters

Abstract

Clusters and Grids have one thing in common: both are used to achieve high performance in computing. The scope of a Cluster is relatively narrow compared to a Grid, because Clusters are homogeneous while Grids are heterogeneous. Another emerging area in High Performance Computing (HPC) is Cloud computing, which can be considered a further extension of Grid computing. Among the issues that exist in Clusters, Grids and Clouds, one is common to all of them: fault tolerance and handling. Fault tolerance is the technique, or set of techniques, used when hardware, software, network and other types of problems arise during the operation of Clusters, Grids and Clouds. In this research we focus on fault identification and forecasting from the Cluster point of view, and try to establish a technique that forecasts faults in Cluster-based environments on the basis of temperature. Nodes continuously receive temperature readings for their attached devices from temperature sensors and check them against the temperature threshold values of those devices. If the device temperatures are within range, we place the machine in the Green zone. If the temperatures are approaching the threshold values, we place the machine in the Orange zone, which indicates that the machine may or may not crash because of temperature. When the devices have crossed the temperature threshold values, we place the machine in the Red zone, which indicates that the machine is likely to fail at any time due to the failure of one or more hardware devices.
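The Green/Orange/Red rating described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how close to the threshold counts as "approaching", so the `margin_c` parameter, the function names, and the rule that a machine takes the worst zone of any of its devices are all hypothetical choices.

```python
from enum import IntEnum
from typing import Mapping


class Zone(IntEnum):
    """Zones ordered by severity so that max() picks the worst one."""
    GREEN = 0
    ORANGE = 1
    RED = 2


def device_zone(temp_c: float, threshold_c: float, margin_c: float = 10.0) -> Zone:
    """Rate one device. margin_c is an assumed 'approaching' band below the threshold."""
    if temp_c > threshold_c:
        return Zone.RED      # threshold crossed
    if temp_c >= threshold_c - margin_c:
        return Zone.ORANGE   # approaching the threshold
    return Zone.GREEN        # comfortably within range


def machine_zone(readings: Mapping[str, float],
                 thresholds: Mapping[str, float],
                 margin_c: float = 10.0) -> Zone:
    """Rate a machine as the worst zone among its devices (an assumption)."""
    zones = (device_zone(readings[d], thresholds[d], margin_c) for d in readings)
    return max(zones, default=Zone.GREEN)


if __name__ == "__main__":
    # Hypothetical readings and thresholds in degrees Celsius.
    thresholds = {"cpu": 80.0, "disk": 55.0}
    readings = {"cpu": 60.0, "disk": 52.0}
    print(machine_zone(readings, thresholds).name)
```

In this sketch a single hot device is enough to move the whole machine out of the Green zone, which matches the abstract's remark that failure of one or more hardware devices can bring the machine down.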

Bibliographic details

Source
2012 15th IEEE International Multitopic Conference | 2012 | pp. 69-77 | 9 pages
Conference location: Islamabad (PK)
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Cluster; Distributed Systems; Fault Forecasting; Fault Tolerance; Grid;

Similar literature

1. Fault Diagnostics on Steam Boilers and Forecasting System Based on Hybrid Fuzzy Clustering and Artificial Neural Networks in Early Detection of Chamber Slagging/Fouling [J]. Mohan Sathya Priya, Radhakrishnan Kanthavel, Muthusamy Saravanan. Circuits and Systems. 2016, No. 12
2. Fault Diagnostics on Steam Boilers and Forecasting System Based on Hybrid Fuzzy Clustering and Artificial Neural Networks in Early Detection of Chamber Slagging/Fouling [J]. Mohan Sathya Priya, Radhakrishnan Kanthavel, Muthusamy Saravanan. Circuits and Systems. 2016, No. 12
3. A clustering-based sales forecasting scheme by using extreme learning machine and ensembling linkage methods with applications to computer server [J]. Chi-Jie Lu, Ling-Jing Kao. Engineering Applications of Artificial Intelligence. 2016, No. octa
4. Temperature based fault forecasting in computer clusters [C] . (missing) IEEE International Multitopic Conference . 2012
5. Fault Detection Based on Mean-Shift Clustering and Immune Danger Theory. [D] . Carson, David. 2016
6. Dissolved Gases Forecasting Based on Wavelet Least Squares Support Vector Regression and Imperialist Competition Algorithm for Assessing Incipient Faults of Transformer Polymer Insulation [O] . Jiefeng Liu, Hanbo Zheng, Yiyi Zhang, 2019
7. Fault Diagnosis and Forecast of Substation Equipment Temperature Based on Fuzzy C Means Clustering Algorithm [O] . Jun Li, Jiangwen Xiao, Jing Wu, 2015
