Method for shortening the resynchronization time following failure in a computer system utilizing separate servers for redundancy

Abstract

An apparatus for, and a method of, enhancing reliability within a cluster lock processing system that has a relatively large number of commodity cluster instruction processors managed by a cluster lock manager. Because the commodity processors have virtually no system viability features, such as memory protection or failure recovery, the cluster/lock processors assume responsibility for providing these functions. The low cost of the commodity cluster instruction processors makes the system almost linearly scalable. The cluster/locking, caching, and mass storage accessing functions are fully integrated into a single hardware platform, which performs the role of cluster/lock master. Upon failure of this hardware platform, a second, redundant hardware platform converts from the slave role to the master role. The logic for failure detection and role swapping is implemented in software, which can run as an application under a commonly available operating system. Furthermore, recovery is accomplished entirely without the assistance of the host computer(s) or of the end user(s) coupled via the host computer(s). Following repair, the failed server is restarted in an orderly fashion and resumes operation in the slave role. For the server to be completely restored, coherent memory must be copied from the master to the slave. Because cluster lock processing must be paused throughout the system while the copy is transferred, the transfer time must be kept as short as possible to limit the impact on system throughput.
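
The failover and restart sequence described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `Role`, `LockServer`, `Cluster`, `acquire`, `fail`, and `restore` are hypothetical, a Python dict stands in for "coherent memory", and the resynchronization is a plain full copy. The abstract does not say how the transfer is shortened, so no such optimization is shown.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"
    FAILED = "failed"


@dataclass
class LockServer:
    name: str
    role: Role
    # Hypothetical stand-in for coherent memory: lock name -> owning host.
    locks: dict = field(default_factory=dict)


class Cluster:
    """Two redundant lock servers; one is master, the other slave."""

    def __init__(self):
        self.a = LockServer("A", Role.MASTER)
        self.b = LockServer("B", Role.SLAVE)
        self.paused = False  # True while a resynchronization copy runs

    def _with_role(self, role):
        return next((s for s in (self.a, self.b) if s.role is role), None)

    def master(self):
        return self._with_role(Role.MASTER)

    def acquire(self, lock, owner):
        """Grant a lock on the master and mirror the grant to the slave."""
        if self.paused:
            raise RuntimeError("lock processing paused for resynchronization")
        m = self.master()
        if lock in m.locks:
            return False
        m.locks[lock] = owner
        s = self._with_role(Role.SLAVE)
        if s is not None:
            s.locks[lock] = owner
        return True

    def fail(self, server):
        """Simulate a server failure; the slave takes over a failed master."""
        was_master = server.role is Role.MASTER
        server.role = Role.FAILED
        server.locks.clear()
        other = self.b if server is self.a else self.a
        if was_master and other.role is Role.SLAVE:
            other.role = Role.MASTER  # role swap, no host involvement

    def restore(self, server):
        """Restart a repaired server as slave, copying state from the master."""
        self.paused = True  # lock processing stops for the whole copy
        try:
            server.locks = dict(self.master().locks)
        finally:
            self.paused = False
        server.role = Role.SLAVE


cluster = Cluster()
cluster.acquire("L1", "host1")
cluster.fail(cluster.a)          # B is promoted to master
cluster.acquire("L2", "host2")   # service continues on B
cluster.restore(cluster.a)       # A rejoins as slave with a full copy
```

In this sketch the pause lasts exactly as long as the copy inside `restore`, which is why the abstract stresses keeping the transfer short.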
