IEEE Symposium on Computational Intelligence and Games

Self-Adapting Payoff Matrices in Repeated Interactions

Abstract

Traditional iterated prisoner's dilemma (IPD) models assume a fixed payoff matrix for all players, which may not be realistic because not all players are the same in the real world. This paper introduces a novel co-evolutionary framework in which each strategy has its own self-adaptive payoff matrix. The framework applies to any simultaneous two-player repeated-encounter game. Each strategy has a set of behavioral responses based on previous moves, and a payoff matrix that adapts through reinforcement feedback from game interactions, as specified by update rules. We study how different update rules affect the adaptation of initially random payoff matrices, and how this adaptation in turn affects the learning of strategy behaviors.
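The abstract does not give the update rules or the strategy encoding, so the following Python sketch is only an illustration of the general idea: a strategy that carries both its behavioral responses and its own payoff matrix, with the matrix adjusted after each interaction. The memory-one encoding, the 2x2 move set, the payoff range, and the `reinforce` rule (along with every name in the block) are assumptions made here for illustration, not the paper's method.

```python
import random
from dataclasses import dataclass

C, D = 0, 1  # cooperate, defect (assumed two-move game)


@dataclass
class Strategy:
    first_move: int  # move played in the first round
    response: dict   # (own_prev, opp_prev) -> next move (assumed memory-one)
    payoff: list     # private matrix: payoff[own_move][opp_move]


def random_strategy(rng):
    """Build a strategy with random responses and a random payoff matrix."""
    response = {(a, b): rng.choice((C, D)) for a in (C, D) for b in (C, D)}
    # The 0..5 range is an arbitrary choice for this sketch.
    payoff = [[rng.uniform(0.0, 5.0) for _ in range(2)] for _ in range(2)]
    return Strategy(rng.choice((C, D)), response, payoff)


def reinforce(s, own, opp, delta=0.1):
    """Hypothetical update rule: raise the payoff entry of the outcome just played."""
    s.payoff[own][opp] += delta


def play(a, b, rounds=50, update=reinforce):
    """Repeated simultaneous game; each player scores with its OWN matrix."""
    score_a = score_b = 0.0
    move_a, move_b = a.first_move, b.first_move
    for _ in range(rounds):
        score_a += a.payoff[move_a][move_b]
        score_b += b.payoff[move_b][move_a]
        # Feedback from this interaction adapts each private matrix.
        update(a, move_a, move_b)
        update(b, move_b, move_a)
        # Both players pick their next move from the previous joint move.
        move_a, move_b = (a.response[(move_a, move_b)],
                          b.response[(move_b, move_a)])
    return score_a, score_b


if __name__ == "__main__":
    rng = random.Random(0)
    print(play(random_strategy(rng), random_strategy(rng)))
```

Passing a different function as `update` is one way to compare update rules, which is the kind of comparison the abstract describes. A co-evolutionary loop that selects and varies strategies would sit on top of `play` and is left out here.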