A Cluster-Based Implementation of a Fault Tolerant Parallel Reduction Algorithm Using Swarm-Array Computing

Abstract

Recent research in multi-agent systems incorporates fault tolerance concepts. However, that research does not explore the extension and implementation of such ideas for large-scale parallel computing systems. The work reported in this paper investigates a swarm-array computing approach, namely 'Intelligent Agents'. In the approach considered, a task to be executed on a parallel computing system is decomposed into sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information in the event of a predicted core/processor failure and to complete the task successfully. The agents hence contribute towards fault tolerance and towards building reliable systems. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator and by the implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.
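The idea described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: it does not use MPI or an FPGA, and the names `Agent`, `migrate`, and `fault_tolerant_reduce` are hypothetical. It assumes that a failure prediction is available as a set of core indices, that an agent on such a core moves to the nearest healthy core, and that the reduction operator is associative and commutative.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, List, Sequence, Set, Tuple


@dataclass
class Agent:
    """Hypothetical agent: carries one partial result and sits on one core."""
    core: int
    value: float


def migrate(agent: Agent, failing: Set[int], n_cores: int) -> None:
    """Move the agent to the nearest healthy core if its core is predicted to fail."""
    if agent.core not in failing:
        return
    healthy = [c for c in range(n_cores) if c not in failing]
    if not healthy:
        raise RuntimeError("no healthy core available")
    agent.core = min(healthy, key=lambda c: abs(c - agent.core))


def fault_tolerant_reduce(
    data: Sequence[float],
    n_cores: int,
    failing: Set[int],
    op: Callable[[float, float], float] = lambda a, b: a + b,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Simulated tree reduction; returns the result and the list of migrations."""
    # Decompose the task into strided chunks, one agent per non-empty chunk.
    agents = [
        Agent(core, reduce(op, data[core::n_cores]))
        for core in range(n_cores)
        if data[core::n_cores]
    ]
    if not agents:
        raise ValueError("empty input")

    migrations: List[Tuple[int, int]] = []
    while True:
        # Before each step, agents on cores predicted to fail relocate.
        for agent in agents:
            old_core = agent.core
            migrate(agent, failing, n_cores)
            if agent.core != old_core:
                migrations.append((old_core, agent.core))
        if len(agents) == 1:
            return agents[0].value, migrations
        # Combine neighbouring agents pairwise; an unpaired agent carries over.
        merged = []
        for i in range(0, len(agents), 2):
            if i + 1 < len(agents):
                merged.append(Agent(agents[i].core, op(agents[i].value, agents[i + 1].value)))
            else:
                merged.append(agents[i])
        agents = merged


if __name__ == "__main__":
    total, moves = fault_tolerant_reduce(list(range(1, 17)), n_cores=4, failing={2})
    print(total, moves)  # prints: 136 [(2, 1)]
```

In this sketch the sub-task on core 2 is carried to core 1 before the reduction proceeds, so the final sum is unaffected by the predicted failure. On a real cluster the migration step would involve inter-process communication rather than a change to an in-memory field.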