U.S. Government Technical Reports

Learning to Cooperate in a Search Mission via Policy Search

Abstract

The dangers and the time involved in clearing an area of unexploded ordnance can be reduced by a system of unmanned, autonomous robots, and the system needs less time when more than one robot cooperates to search the area. The reinforcement learning algorithm GPOMDP is evaluated for the specific task of finding a decision rule that, given a map and the robot's position on it, enables the robot to choose automatically among the possible actions. The actions lead to a near-optimal path through an area in which some parts must be searched. A neural network serves as a function approximator that stores and improves the decision rule and selects actions according to it. The problem is then extended to two robots sharing the same decision rule; the setting is distributed in the sense that each robot picks actions according to its own perception of the surroundings, independently of the other robot's actions. To achieve cooperation, the robots are trained to maximize a shared reward equal to the sum of the individual rewards given for the consequences of each robot's actions. When the learnt policy is used to search the largest of the experimental areas, two robots trained with the shared reward need 70% of the time a single optimal robot would require, whereas two agents trained on their individual rewards need 88%.
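To make the abstract's method concrete, the core of GPOMDP is a score-function eligibility trace combined with an online average of reward-weighted traces. The sketch below is a minimal illustration on a hypothetical one-dimensional toy task (the grid, reward, and all parameter values are assumptions for illustration, not the report's UXO search domain, and a tabular softmax policy stands in for the report's neural network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D task: move right to reach the marked cell (an assumed
# stand-in for the report's search problem, not its actual domain).
N_STATES, N_ACTIONS = 5, 2          # actions: 0 = left, 1 = right
GOAL = N_STATES - 1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(s, a):
    s2 = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

def gpomdp(theta, beta=0.9, episodes=2000, alpha=0.1, horizon=20):
    """Single-agent GPOMDP: the eligibility trace z discounts and sums
    score-function terms, and delta averages r_t * z_t online."""
    for _ in range(episodes):
        s = 0
        z = np.zeros_like(theta)        # eligibility trace
        delta = np.zeros_like(theta)    # gradient estimate
        for t in range(horizon):
            p = softmax(theta[s])
            a = rng.choice(N_ACTIONS, p=p)
            # score function: grad of log pi(a|s) for a softmax policy
            g = np.zeros_like(theta)
            g[s] = -p
            g[s, a] += 1.0
            z = beta * z + g
            s, r = step(s, a)
            delta += (r * z - delta) / (t + 1)
        theta += alpha * delta          # gradient ascent on reward
    return theta

theta = gpomdp(np.zeros((N_STATES, N_ACTIONS)))
# Probability of choosing "right" in each non-goal state.
probs = np.array([softmax(theta[s]) for s in range(N_STATES - 1)])
print(probs[:, 1])
```

For the two-robot case described above, each robot would run this same update on its own observations, but the scalar `r` fed into the estimate would be the shared reward, i.e. the sum of both robots' individual rewards.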
