IEEE International Conference on Systems, Man, and Cybernetics

Nash-reinforcement learning (N-RL) for developing coordination strategies in non-transferable utility games

Abstract

Social (central) planning is normally used in the literature to optimize the system-wide efficiency and utility of multi-operator systems. Central planning tries to maximize the system's benefits by coordinating the operators' strategies and reducing externalities, assuming that all parties are willing to cooperate. This assumption implies that operators are willing to base their decisions on group rationality rather than individual rationality, even if increased group benefits result in reduced benefits for some agents. It limits the applicability of the social planner's solutions, as perfect cooperation among agents is often infeasible in the real world. Because decisions in human systems are normally based on individual rationality, cooperative game theory methods are normally employed to address this major limitation of the social planner's methods. Game theory methods revise the social planner's solution so that group benefits increase and no agent's cooperative gain is less than its non-cooperative gain. However, in most cases utility is assumed to be transferable, and the literature has not sufficiently focused on non-transferable utility games. In such games, parties are willing to cooperate and coordinate their strategies to increase their benefits, but have no ability to compensate each other to promote cooperation. To a large extent, the transferable utility assumption is due to the complexity of the calculations needed to find the best response strategies of agents in non-cooperative and cooperative modes, especially in multi-period games. By combining Reinforcement Learning and the Nash bargaining solution, this paper develops a new method for applying cooperative game theory to complex multi-period non-transferable utility games. For illustration, the suggested method is applied to two numerical examples in which two hydropower operators seek to develop a fair and efficient cooperation mechanism to increase their gains.
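To make the contrast between the social planner's solution and the Nash bargaining solution concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a small, already-known set of joint strategies with fixed payoffs for two operators. The function name `nash_bargaining_choice`, the strategy labels, and all payoff numbers are hypothetical. How such payoffs would be obtained in a multi-period setting (the abstract attributes this to Reinforcement Learning) is outside the sketch. With no side payments available, the chosen strategy must itself leave each operator at least as well off as in the non-cooperative case.

```python
from typing import Dict, Optional, Tuple

Payoff = Tuple[float, float]  # (operator 1 payoff, operator 2 payoff)


def nash_bargaining_choice(
    payoffs: Dict[str, Payoff], disagreement: Payoff
) -> Optional[str]:
    """Return the joint strategy with the largest Nash product.

    Only individually rational strategies are considered, i.e. those that
    give every operator at least its non-cooperative (disagreement) payoff.
    Returns None when no such strategy exists.
    """
    d1, d2 = disagreement
    best_name, best_product = None, -1.0
    for name, (u1, u2) in payoffs.items():
        # Individual rationality: skip strategies that hurt either operator.
        if u1 < d1 or u2 < d2:
            continue
        # Nash product of the gains over the disagreement point.
        product = (u1 - d1) * (u2 - d2)
        if product > best_product:
            best_name, best_product = name, product
    return best_name


# Hypothetical payoffs for two hydropower operators (illustrative only).
candidate_payoffs = {
    "max_total": (14.0, 3.0),  # largest total, but operator 2 loses
    "balanced": (9.0, 7.0),
    "favor_op1": (11.0, 4.5),
}
noncooperative = (6.0, 4.0)  # assumed non-cooperative payoffs

# A social planner would pick the strategy with the largest total payoff.
planner_choice = max(candidate_payoffs, key=lambda k: sum(candidate_payoffs[k]))
chosen = nash_bargaining_choice(candidate_payoffs, noncooperative)
print(planner_choice, chosen)
```

In this toy setting the planner's choice leaves operator 2 below its non-cooperative payoff, and no compensation is possible because utility is non-transferable. The Nash product instead selects a strategy that both operators prefer to acting alone.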
