首页> 外文会议>2012 Third International Conference on Networking and Computing. >An Efficient GPU Implementation of Ant Colony Optimization for the Traveling Salesman Problem
【24h】

An Efficient GPU Implementation of Ant Colony Optimization for the Traveling Salesman Problem

机译:求解旅行商问题的蚁群优化的高效GPU实现

获取原文
获取原文并翻译 | 示例

摘要

Graphics Processing Units (GPUs) are specialized microprocessors that accelerate graphics operations. Recent GPUs, which have many processing units connected with an off-chip global memory, can be used for general purpose parallel computation. Ant Colony Optimization (ACO) approaches have been introduced as ature-inspired heuristics to find good solutions of the Traveling Salesman Problem (TSP). In ACO approaches, a number of ants traverse the cities of the TSP to find better solutions of the TSP. The ants randomly select next visiting cities based on the probabilities determined by total amounts of their pheromone spread on routes. The main contribution of this paper is to present sophisticated and efficient implementation of one of the ACO approaches on the GPU. In our implementation, we have considered many programming issues of the GPU architecture including coalesced access of global memory, shared memory bank conflicts, etc. In particular, we present a very efficient method for random selection of next cities by a number of ants. Our new method uses iterative random trial which can find next cities in few computational costs with high probability. The experimental results on NVIDIA GeForce GTX 580 show that our implementation for 1002 cities runs in 8.71 seconds, while a conventional CPU implementation runs in 381.95 seconds. Thus, our GPU implementation attains a speed-up factor of 43.47.
机译:图形处理单元(GPU)是专用的微处理器,可以加速图形操作。具有与片外全局存储器连接的许多处理单元的最新GPU可用于通用并行计算。引入蚁群优化(ACO)方法作为受自然启发的启发式方法,以找到旅行商问题(TSP)的良好解决方案。在ACO方法中,许多蚂蚁遍历TSP的城市以找到TSP的更好解决方案。蚂蚁根据其在路径上传播的信息素总量确定的概率,随机选择下一个访问的城市。本文的主要贡献是提出了ACO方法之一在GPU上的复杂而有效的实现。在我们的实现中,我们考虑了GPU体系结构的许多编程问题,包括对全局内存的合并访问,共享内存库冲突等。特别是,我们提出了一种非常有效的方法,用于由许多蚂蚁随机选择下一个城市。我们的新方法使用迭代随机试验,可以以极少的计算成本找到下一个城市。在NVIDIA GeForce GTX 580上的实验结果表明,我们在1002个城市中的实施时间为8.71秒,而传统的CPU实施时间为381.95秒。因此,我们的GPU实现达到了43.47的加速因子。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号