Bringing Fairness to Actor-Critic Reinforcement Learning for Network Utility Optimization

Abstract

Fairness is a crucial design objective in virtually all network optimization problems in which limited system resources are shared by multiple agents. Recently, reinforcement learning has been successfully applied to autonomous online decision making in many network design and optimization problems. However, most of these approaches maximize the long-term (discounted) reward of all agents without taking fairness into account. In this paper, we propose a family of algorithms that bring fairness to actor-critic reinforcement learning for optimizing general fairness utility functions. In particular, we present a novel method for adjusting the rewards in standard reinforcement learning by a multiplicative weight that depends on both the shape of the fairness utility and some statistics of past rewards. It is shown that, for a proper choice of the adjusted rewards, a policy gradient update converges to at least a stationary point of general α-fairness utility optimization. This result inspires the design of fairness optimization algorithms in actor-critic reinforcement learning. Evaluations show that the proposed algorithm can be easily deployed in real-world network optimization problems, such as wireless scheduling and video QoE optimization, and can significantly improve the fairness utility value over previous heuristics and learning algorithms.
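
The abstract does not give the formula for the multiplicative weight, so the following is only a minimal sketch of the idea under stated assumptions. It uses the standard α-fairness utility (x^(1-α)/(1-α) for α ≠ 1, log x for α = 1) and assumes, hypothetically, that the weight is the derivative of that utility evaluated at the mean of an agent's past rewards. The function names `alpha_fair_utility`, `reward_weight`, and `adjusted_rewards` are illustrative and are not taken from the paper.

```python
import math


def alpha_fair_utility(x, alpha):
    """Standard alpha-fairness utility of a positive quantity x."""
    if x <= 0:
        raise ValueError("x must be positive")
    if alpha == 1:
        return math.log(x)
    return x ** (1 - alpha) / (1 - alpha)


def reward_weight(past_rewards, alpha, eps=1e-8):
    """Hypothetical multiplicative weight (an assumption, not the paper's rule).

    Uses the derivative of the alpha-fairness utility, mean ** (-alpha),
    evaluated at the mean of the agent's past rewards. The mean is clamped
    to eps to avoid division by zero.
    """
    mean = sum(past_rewards) / len(past_rewards)
    return max(mean, eps) ** (-alpha)


def adjusted_rewards(rewards, histories, alpha):
    """Scale each agent's current reward by its own weight."""
    return [r * reward_weight(h, alpha) for r, h in zip(rewards, histories)]


if __name__ == "__main__":
    # Two agents receive the same reward now, but agent 1 has been
    # better served in the past, so its adjusted reward is smaller.
    print(adjusted_rewards([1.0, 1.0], [[1.0], [4.0]], alpha=1))
```

Under this assumption, α = 0 leaves the rewards unchanged (plain sum-reward maximization), while larger α shifts more weight toward agents whose past rewards were low.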

Bibliographic details

Source
IEEE Conference on Computer Communications | 2021 | pp. 1-10 | 10 pages
Authors
Jingdi Chen; Yimeng Wang; Tian Lan
Original format
PDF
Keywords
Wireless communication; Training; Shape; Heuristic algorithms; Decision making; Reinforcement learning; Scheduling
Date indexed
2022-08-26 13:58:26

Similar literature

1. Uplink NOMA-based long-term throughput maximization scheme for cognitive radio networks: an actor-critic reinforcement learning approach [J]. Giang Hoang Thi Huong, Hoan Tran Nhut Khai, Koo Insoo. Wireless Networks. 2021, No. 2
2. An efficient actor-critic reinforcement learning for device-to-device communication underlaying sectored cellular network [J]. Khuntia Pratap, Hazra Ranjay, Chong Peter. International Journal of Communication Systems. 2020, No. 10
3. Totally model-free actor-critic recurrent neural-network reinforcement learning in non-Markovian domains [J]. Mizutani Eiji, Dreyfus Stuart. Annals of Operations Research. 2017, No. 1
4. Adaptive proportional fair parameterization based LTE scheduling using continuous actor-critic reinforcement learning [C]. Comsa Ioan Sorin, Sijing Zhang, Aydin Mehmet. IEEE Global Communications Conference. 2014
5. Data-Driven Online Network Optimization Through Reinforcement Learning [D]. Wang, Yimeng. 2021
6. Reinforcement Learning for Energy Optimization with 5G Communications in Vehicular Social Networks [O]. Hyebin Park, Yujin Lim. 2020
7. Fairness-Aware Link Optimization for Space-Terrestrial Integrated Networks: A Reinforcement Learning Framework [O]. Atefeh Hajijamali Arani, Peng Hu, Yeying Zhu. 2021
