Towards Optimal Fault Tolerant Scheduling in Computational Grid

Conference: International Conference on Emerging Technologies

Abstract

Grid environments pose significant challenges because of the diverse failures encountered during job execution. Computational grids provide the main execution platform for long-running jobs, which require a long commitment of grid resources, so fault tolerance in such an environment cannot be ignored. Most grid middleware has either ignored failure issues or developed ad hoc solutions. Most existing fault tolerance techniques are application dependent and cause a cognitive problem. This paper examines existing fault detection and tolerance techniques in various middleware. We propose a fault-tolerant layered grid architecture with a cross-layered design. In our approach, the Hybrid Particle Swarm Optimization (HPSO) algorithm and the Anycast technique are used in conjunction with the Globus middleware. We adopt a proactive and reactive fault management strategy for centralized and distributed environments. The proposed strategy helps identify the root cause of failures and resolve the cognitive problem. It minimizes computation and communication, thus achieving higher reliability. Anycast limits the effect of Denial of Service/Distributed Denial of Service (DoS/DDoS) attacks to the area nearest the source of the attack, thus achieving better security. Using Anycast before HPSO yields a significant performance improvement. Selecting more reliable nodes reduces checkpointing overhead.
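The closing claim, that more reliable nodes reduce checkpointing overhead, can be illustrated with a small sketch. The abstract does not say how the authors schedule checkpoints, so this example swaps in Young's classical first-order approximation for the checkpoint interval, sqrt(2 * C * MTBF), purely as an assumption. The function names, the 60-second checkpoint cost, and the MTBF values are hypothetical and are not taken from the paper.

```python
import math


def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the checkpoint interval.

    Assumption: failures are rare relative to the checkpoint cost.
    This is NOT the paper's method, only a standard textbook model.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def checkpoint_overhead(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Fraction of each work-plus-checkpoint cycle spent writing the checkpoint."""
    interval = young_interval(checkpoint_cost_s, mtbf_s)
    return checkpoint_cost_s / (interval + checkpoint_cost_s)


if __name__ == "__main__":
    cost_s = 60.0  # hypothetical time to write one checkpoint
    # Hypothetical nodes with increasing mean time between failures (hours).
    for mtbf_h in (6, 24, 96):
        mtbf_s = mtbf_h * 3600.0
        interval = young_interval(cost_s, mtbf_s)
        overhead = checkpoint_overhead(cost_s, mtbf_s)
        print(f"MTBF {mtbf_h:3d} h: checkpoint every {interval:8.1f} s, "
              f"overhead {overhead:.2%}")
```

Under this model, the overhead fraction falls as MTBF grows, because a node that fails less often can go longer between checkpoints. That matches the direction of the abstract's claim, though the paper's actual figures may differ.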
