Learning 2-opt Heuristics for the Traveling Salesman Problem via Deep Reinforcement Learning

Paulo R d O Costa; Jason Rhuggenaath; Yingqian Zhang; Alp Akcay

Abstract

Recent works using deep learning to solve the Traveling Salesman Problem (TSP) have focused on learning construction heuristics. Such approaches find TSP solutions of good quality but require additional procedures such as beam search and sampling to improve solutions and achieve state-of-the-art performance. However, few studies have focused on improvement heuristics, where a given solution is improved until reaching a near-optimal one. In this work, we propose to learn a local search heuristic based on 2-opt operators via deep reinforcement learning. We propose a policy gradient algorithm to learn a stochastic policy that selects 2-opt operations given a current solution. Moreover, we introduce a policy neural network that leverages a pointing attention mechanism, which, unlike previous works, can be easily extended to more general $k$-opt moves. Our results show that the learned policies can improve even over random initial solutions and approach near-optimal solutions at a faster rate than previous state-of-the-art deep learning methods.
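
For readers unfamiliar with the 2-opt operator the abstract refers to, the following is a minimal sketch of a single 2-opt move on a tour, assuming the tour is stored as a Python list of city indices. It is not the paper's learned policy: the function names `tour_length` and `two_opt_move` and the four-city instance are illustrative assumptions, and the positions `i` and `j` are fixed by hand here, whereas in the paper they would be selected by the stochastic policy.

```python
import math


def tour_length(points, tour):
    """Total length of the closed tour that visits `points` in `tour` order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n)
    )


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] (inclusive) reversed.

    Reversing the segment removes the two edges entering and leaving it
    and reconnects the tour with two new edges, which is a 2-opt move.
    """
    if not 0 <= i < j < len(tour):
        raise ValueError("require 0 <= i < j < len(tour)")
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


# Hypothetical instance: the four corners of a unit square.
points = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
crossed = [0, 1, 2, 3]                  # this tour uses both diagonals
improved = two_opt_move(crossed, 1, 2)  # positions picked by hand, not learned

print(improved)
print(tour_length(points, crossed), tour_length(points, improved))
```

On this instance the move replaces the two crossing diagonals with two sides of the square, so the tour gets shorter. An improvement heuristic of the kind the abstract describes applies such moves repeatedly, starting from a given solution.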

Bibliographic details

Source: JMLR: Workshop and Conference Proceedings, 2020, No. 2010, 16 pages
Authors: Paulo R d O Costa; Jason Rhuggenaath; Yingqian Zhang; Alp Akcay
Original format: PDF
Keywords: Deep Reinforcement Learning; Combinatorial Optimization; Traveling Salesman Problem
