Source: JMLR: Workshop and Conference Proceedings

Learning 2-opt Heuristics for the Traveling Salesman Problem via Deep Reinforcement Learning


Abstract

Recent works using deep learning to solve the Traveling Salesman Problem (TSP) have focused on learning construction heuristics. Such approaches find TSP solutions of good quality but require additional procedures such as beam search and sampling to improve solutions and achieve state-of-the-art performance. However, few studies have focused on improvement heuristics, where a given solution is improved until reaching a near-optimal one. In this work, we propose to learn a local search heuristic based on 2-opt operators via deep reinforcement learning. We propose a policy gradient algorithm to learn a stochastic policy that selects 2-opt operations given a current solution. Moreover, we introduce a policy neural network that leverages a pointing attention mechanism, which, unlike previous works, can be easily extended to more general $k$-opt moves. Our results show that the learned policies can improve even over random initial solutions and approach near-optimal solutions at a faster rate than previous state-of-the-art deep learning methods.
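
For readers unfamiliar with the operator, the following is a minimal Python sketch of what a single 2-opt operation does to a tour: it reverses the segment between two positions, which replaces two edges of the tour with two new ones. The function names (`tour_length`, `two_opt_move`, `random_two_opt_search`) are hypothetical and chosen for illustration only. The uniform random choice of positions is a stand-in for the learned stochastic policy described in the abstract, not the authors' method; no neural network or policy gradient training is shown.

```python
import math
import random


def tour_length(coords, tour):
    """Total Euclidean length of the closed tour (returns to the start city)."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
        for k in range(n)
    )


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (0 <= i < j < n)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def random_two_opt_search(coords, tour, steps, seed=0):
    """Apply `steps` 2-opt moves and return the best tour seen.

    Assumption: positions (i, j) are sampled uniformly at random. This is
    only a placeholder for a learned policy that would select (i, j)
    given the current solution.
    """
    rng = random.Random(seed)
    current = list(tour)
    best, best_len = list(current), tour_length(coords, current)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(current)), 2))
        current = two_opt_move(current, i, j)
        current_len = tour_length(coords, current)
        if current_len < best_len:
            best, best_len = list(current), current_len
    return best, best_len


if __name__ == "__main__":
    # Four corners of the unit square, visited in an order that crosses itself.
    coords = [(0, 0), (1, 1), (1, 0), (0, 1)]
    crossing = [0, 1, 2, 3]
    uncrossed = two_opt_move(crossing, 1, 2)
    print(uncrossed, tour_length(coords, uncrossed))
```

In this toy instance, reversing positions 1 through 2 removes the crossing and shortens the tour. Like the setting the abstract describes, the search loop accepts every sampled move and only records the best tour seen, rather than accepting improving moves only.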
