Proactive Fault Monitoring in Enterprise Servers


Abstract

New proactive fault monitoring innovations are being developed, demonstrated on executing servers, and productized for enhancing the reliability, availability, and serviceability of enterprise-class servers. A continuous system telemetry harness (CSTH) has been developed that collects time series signals relating to the health of dynamically executing servers. These time series provide quantitative metrics associated with physical variables (distributed temperatures, voltages, and currents throughout the system), "soft" performance variables (loads, throughputs, queue lengths, bit error rates, etc.), and various quality-of-service (QoS) metrics. The CSTH signals are continuously archived to an offline circular file (i.e., the "Black Box Flight Recorder") that is helping to identify and eliminate costly sources of No-Trouble-Found (NTF) events in Sun systems; the signals are concurrently processed in real time using advanced pattern recognition for proactive anomaly detection. Examples are presented of the use of the CSTH coupled with pattern recognition for high-sensitivity predictive failure analysis, which is helping to meet component and system availability goals while decreasing the incidence of NTF events, which have become a costly serviceability/warranty issue in the enterprise computing industry.
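
The two data paths described in the abstract, continuous archiving to a circular file and concurrent real-time anomaly detection, can be illustrated with a minimal sketch. The abstract does not describe the CSTH implementation or its pattern recognition method, so everything here is an assumption for illustration only: the names `TelemetryRecorder`, `is_anomalous`, and `monitor` are hypothetical, an in-memory ring buffer stands in for the circular file, and a simple z-score test stands in for the "advanced pattern recognition".

```python
from collections import deque
from statistics import mean, stdev


class TelemetryRecorder:
    """Hypothetical stand-in for the circular archive: a fixed-capacity
    ring buffer in which the newest samples overwrite the oldest."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest item when full.
        self.samples = deque(maxlen=capacity)

    def record(self, timestamp, value):
        self.samples.append((timestamp, value))

    def values(self):
        return [v for _, v in self.samples]


def is_anomalous(history, value, threshold=3.0):
    """Flag `value` if it lies more than `threshold` sample standard
    deviations from the mean of `history` (a simple z-score test, not
    the method used by the authors)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        return value != history[0]
    return abs(value - mean(history)) / sigma > threshold


def monitor(readings, capacity=8, threshold=3.0):
    """Check each reading against the archived window, then archive it."""
    recorder = TelemetryRecorder(capacity)
    alerts = []
    for t, value in enumerate(readings):
        if is_anomalous(recorder.values(), value, threshold):
            alerts.append((t, value))
        recorder.record(t, value)
    return recorder, alerts


# Made-up temperature readings (degrees C) with one injected spike.
readings = [50.0, 50.5, 49.5, 50.2, 49.8, 50.1, 49.9, 50.3, 75.0, 50.0]
recorder, alerts = monitor(readings)
print(alerts)
```

Every sample is archived, including flagged ones, so the buffer always holds the most recent window of raw telemetry for later diagnosis, which is the role the abstract assigns to the "Black Box Flight Recorder".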
