
Policy Learning for Continuous Space Security Games Using Neural Networks

Abstract

A wealth of algorithms centered around (integer) linear programming have been proposed to compute equilibrium strategies in security games with discrete states and actions. However, in practice many domains possess continuous state and action spaces. In this paper, we consider a continuous space security game model with infinite-size action sets for players and present a novel deep learning based approach to extend the existing toolkit for solving security games. Specifically, we present (i) OptGradFP, a novel and general algorithm that searches for the optimal defender strategy in a parameterized continuous search space, and can also be used to learn policies over multiple game states simultaneously; (ii) OptGradFP-NN, a convolutional neural network based implementation of OptGradFP for continuous space security games. We demonstrate the potential to predict good defender strategies via experiments and analysis of OptGradFP and OptGradFP-NN on discrete and continuous game settings.
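The abstract itself contains no code; as a rough illustration only, the sketch below shows the general idea its description (and the algorithm's name) suggests: a defender strategy represented by a parameterized continuous policy, updated by score-function policy gradients against a fictitious-play-style memory of attacker responses. Everything concrete here is an assumption for the sake of a runnable toy, not the paper's formulation: a zero-sum game on the unit interval with payoff -|d - a|, a Gaussian policy squashed by a sigmoid, a grid-search attacker best response, and arbitrary step sizes and batch sizes.

import numpy as np

rng = np.random.default_rng(0)

def defender_payoff(d, a):
    # Toy utility: the defender wants to be close to the attacker's target.
    return -np.abs(d - a)

def sample_defender(theta, n):
    # Illustrative defender policy: Gaussian with learnable mean and
    # log-stddev, squashed to the action space [0, 1] by a sigmoid.
    mu, log_sigma = theta
    z = rng.normal(mu, np.exp(log_sigma), size=n)
    return 1.0 / (1.0 + np.exp(-z)), z

def log_prob_grad(theta, z):
    # Gradient of log N(z; mu, sigma) w.r.t. (mu, log_sigma),
    # used by the score-function (REINFORCE-style) estimator.
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    d_mu = (z - mu) / sigma**2
    d_log_sigma = (z - mu) ** 2 / sigma**2 - 1.0
    return np.stack([d_mu, d_log_sigma], axis=1)

theta = np.array([0.0, 0.0])   # defender policy parameters
attacker_history = [0.8]       # fictitious-play memory of attacker replies
lr, batch = 0.05, 256

for it in range(200):
    # Attacker replies (approximately) best to the defender's current mixed
    # strategy: a crude grid search against samples from the policy.
    d_samples, _ = sample_defender(theta, 512)
    grid = np.linspace(0.0, 1.0, 101)
    br = grid[np.argmax([(-defender_payoff(d_samples, a)).mean() for a in grid])]
    attacker_history.append(br)

    # Policy-gradient update of the defender against a uniform sample of the
    # attacker's historical actions (the fictitious-play average).
    a = rng.choice(np.array(attacker_history), size=batch)
    d, z = sample_defender(theta, batch)
    r = defender_payoff(d, a)
    grad = (log_prob_grad(theta, z) * (r - r.mean())[:, None]).mean(axis=0)
    theta = theta + lr * grad

print("learned defender mean action:", float(1.0 / (1.0 + np.exp(-theta[0]))))

In the paper's OptGradFP-NN variant, per the abstract, the policy would instead be produced by a convolutional neural network, which is presumably what allows strategies to be learned over multiple game states simultaneously rather than for a single fixed parameter vector as above.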
