Modification of Q-learning to Adapt to the Randomness of Environment

Abstract

Q-learning is a typical model-free reinforcement learning algorithm that achieves a goal by interacting with an uncertain environment. However, conventional Q-learning can fail to converge, and may even learn bad policies, when the state transitions and immediate rewards of the environment are randomly distributed. This paper modifies the Q-learning algorithm by using a Monte Carlo method to address these problems. Simulation experiments are then performed to validate the modified Q-learning algorithm.
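
The abstract does not spell out the modification, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It contrasts the standard tabular Q-learning update with one common Monte Carlo-style variant that averages the sampled targets using a step size of 1/N(s, a). The one-state noisy environment (noisy_step), the function names, and all parameter values are hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict


def q_update(q, s, a, r, s_next, actions, alpha, gamma):
    """Standard tabular Q-learning update with a fixed step size alpha."""
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])


def averaged_q_update(q, counts, s, a, r, s_next, actions, gamma):
    """Assumed variant: step size 1/N(s, a), so Q(s, a) becomes the
    running mean of the sampled targets rather than tracking the latest one."""
    counts[(s, a)] += 1
    q_update(q, s, a, r, s_next, actions, 1.0 / counts[(s, a)], gamma)


def noisy_step(rng, a):
    """Hypothetical one-state environment with Gaussian reward noise.
    Action 1 has mean reward 1.0, action 0 has mean reward 0.5."""
    mean = 1.0 if a == 1 else 0.5
    return rng.gauss(mean, 2.0), 0  # (reward, next state)


def run(seed=0, steps=5000, gamma=0.0):
    """Train with uniformly random actions; gamma=0 isolates the reward noise."""
    rng = random.Random(seed)
    actions = (0, 1)
    q = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(steps):
        a = rng.choice(actions)
        r, s_next = noisy_step(rng, a)
        averaged_q_update(q, counts, 0, a, r, s_next, actions, gamma)
    return q


if __name__ == "__main__":
    q = run()
    print({k: round(v, 2) for k, v in sorted(q.items())})
```

With a fixed step size, each estimate keeps fluctuating with the reward noise; with the 1/N(s, a) step size, the noise averages out as visits accumulate, which is the general effect a Monte Carlo averaging scheme aims for in a random environment.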

Bibliographic details

Source
2019, pp. 1-4 (4 pages)
Authors
Xiulian Luo; Youbing Gao; Shao Huang; Yaodong Zhao; Shengmiao Zhang
Original format
PDF
Keywords
Convergence; Machine learning; Laboratories; Automation; Learning (artificial intelligence); Monte Carlo methods; Training

Similar literature

1. Adaptive packet scheduling in IoT environment based on Q-learning [J]. Kim Donghyun, Lee Taeho, Kim Sejun. Journal of Ambient Intelligence and Humanized Computing, 2020, No. 6.
2. Adaptive Packet Scheduling in IoT Environment Based on Q-learning [J]. Donghyun Kim, Taeho Lee, Sejun Kim. Procedia Computer Science, 2018, No. 5.
3. An Adjustment Method of the Number of States on Q-Learning Segmenting State Space Adaptively [J]. Tomoki Hamagami, Seiichi Koakutsu, Hironori Hirata. Electronics and Communications in Japan, 2007, No. 9.
4. Modification of Q-learning to Adapt to the Randomness of Environment [C]. Xiulian Luo, Youbing Gao, Shao Huang. International Conference on Control, Automation and Information Sciences, 2019.
5. Adaptive Interventions Treatment Modelling and Regimen Optimization Using Sequential Multiple Assignment Randomized Trials (SMART) and Q-Learning [D]. Baniya, Abiral. 2018.
6. Adaptive Learning Recommendation Strategy Based on Deep Q-learning [O]. Chunxi Tan, Ruijian Han, Rougang Ye. 2020.
7. BLER-based Adaptive Q-learning for Efficient Random Access in NOMA-based mMTC Networks [O]. Duc-Dung Tran, Shree Krishna Sharma, Symeon Chatzinotas. 2021.
8. Adaptively Optimizing the Algorithms for Adaptive Antenna Arrays for Randomly Time-Varying Mobile Communications Systems [R]. Buche, R. T.; Kushner, H. J. 2002.
