Giving Advice about Preferred Actions to Reinforcement Learners Via Knowledge-Based Kernel Regression

Abstract

We present a novel formulation for providing advice to a reinforcement learner that employs support-vector regression as its function approximator. Our new method extends a recent advice-giving technique, called Knowledge-Based Kernel Regression (KBKR), that accepts advice concerning a single action of a reinforcement learner. In KBKR, users can say that in some set of states, an action's value should be greater than some linear expression of the current state. In our new technique, which we call Preference KBKR (Pref-KBKR), the user can provide advice in a more natural manner by recommending that some action is preferred over another in the specified set of states. Specifying preferences essentially means that users are giving advice about policies rather than Q values, which is a more natural way for humans to present advice. We present the motivation for preference advice and a proof of the correctness of our extension to KBKR. In addition, we show empirical results that our method can make effective use of advice on a novel reinforcement-learning task, based on the RoboCup simulator, which we call Breakaway. Our work demonstrates the significant potential of advice-giving techniques for addressing complex reinforcement learning problems, while further demonstrating the use of support-vector regression for reinforcement learning.
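The two advice forms described in the abstract can be illustrated with a minimal sketch. It assumes linear Q-models and a state region written as linear inequalities `B x <= d`; the class names, the soccer-style action names, the feature layout, and all numeric values are hypothetical and do not come from the paper. The sketch only measures how much a given state violates a piece of advice; it does not reproduce the paper's optimization formulation or its proof.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


@dataclass
class LinearQ:
    """Hypothetical linear Q-model for one action: Q(x) = w . x + b."""
    w: List[float]
    b: float

    def value(self, x: Vector) -> float:
        return dot(self.w, x) + self.b


@dataclass
class Region:
    """Set of states {x : B x <= d}, one row of B per inequality (assumed form)."""
    B: List[List[float]]
    d: List[float]

    def contains(self, x: Vector) -> bool:
        return all(dot(row, x) <= di for row, di in zip(self.B, self.d))


def kbkr_slack(q: LinearQ, region: Region, h: Vector, beta: float, x: Vector) -> float:
    """Single-action advice: inside the region, Q(x) >= h . x + beta.
    Returns the amount by which state x violates it (0.0 if satisfied or outside)."""
    if not region.contains(x):
        return 0.0
    return max(0.0, dot(h, x) + beta - q.value(x))


def pref_slack(q_pref: LinearQ, q_other: LinearQ, region: Region,
               beta: float, x: Vector) -> float:
    """Preference advice: inside the region, Q_pref(x) - Q_other(x) >= beta.
    Returns the amount by which state x violates it (0.0 if satisfied or outside)."""
    if not region.contains(x):
        return 0.0
    return max(0.0, beta - (q_pref.value(x) - q_other.value(x)))


# Hypothetical state features: [distance_to_goal, angle_to_goal].
near_goal = Region(B=[[1.0, 0.0]], d=[10.0])      # distance_to_goal <= 10
q_shoot = LinearQ(w=[-0.125, 0.0], b=1.0)          # made-up weights
q_pass = LinearQ(w=[0.0, 0.0], b=0.5)              # made-up weights

# "When near the goal, prefer shoot over pass by at least 0.25."
close_state, mid_state, far_state = [2.0, 0.0], [8.0, 0.0], [20.0, 0.0]
violations = [pref_slack(q_shoot, q_pass, near_goal, 0.25, s)
              for s in (close_state, mid_state, far_state)]
```

Preference advice compares two actions directly, so the user never has to state an absolute Q value. Single-action advice (`kbkr_slack`) requires a numeric lower bound `h . x + beta` on one action's value.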

Bibliographic details

Source: National Conference on Artificial Intelligence, 2005, 6 pages
Authors: Richard Maclin; Jude Shavlik; Lisa Torrey; Trevor Walker; Edward Wild
Original format: PDF
Chinese Library Classification: TP18-53
