International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms

Directed Policy Search for Decision Making Using Relevance Vector Machines



Abstract

Several recent learning approaches in decision making under uncertainty suggest the use of classifiers for representing policies compactly. The space of possible policies, even under such structured representations, is huge and must be searched carefully to avoid computationally expensive policy simulations (rollouts). In our recent work, we proposed a method for directed exploration of policy space using support vector classifiers, whereby rollouts are directed to states around the boundaries between different action choices indicated by the separating hyperplanes in the represented policies. While effective, this method suffers from the growing number of support vectors in the underlying classifiers as the number of training examples increases. In this paper, we propose an alternative method for directed policy search based on relevance vector machines. Relevance vector machines are used both for classification (to represent a policy) and regression (to approximate the corresponding relative action advantage function). Classification is enhanced by anomaly detection for accurate policy representation. Exploiting the internal structure of the regressor, we guide the probing of the state space only to critical areas corresponding to changes of action dominance in the underlying policy. This directed focus on critical parts of the state space iteratively leads to refinement and improvement of the underlying policy and delivers excellent control policies in only a few iterations, while the small number of relevance vectors yields significant computational time savings. We demonstrate the proposed approach and compare it with our previous method on standard reinforcement learning domains (inverted pendulum and mountain car).
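The directed-search loop described in the abstract can be illustrated with a small, self-contained sketch. This is not the paper's implementation: the relevance vector machine is replaced by a simple RBF-weighted vote, the domain is a toy 1-D regulation task standing in for the pendulum/mountain-car benchmarks, and all names and constants are illustrative. It shows the core idea only: estimate per-action values by rollouts, fit a kernel classifier as the policy, then concentrate the next batch of rollouts near the states where the action advantage changes sign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D task (stand-in for the paper's benchmarks): state x in [-1, 1],
# actions {-1, +1}, reward -|x|, dynamics x' = clip(x + 0.1 * a).
# The optimal policy moves toward the origin: a* = -sign(x).
ACTIONS = (-1, 1)

def step(x, a):
    x2 = np.clip(x + 0.1 * a, -1.0, 1.0)
    return x2, -abs(x2)

def rollout_q(x, a, policy, horizon=20, gamma=0.95):
    """Monte Carlo estimate of Q(x, a): take a once, then follow `policy`."""
    x, r = step(x, a)
    total, discount = r, gamma
    for _ in range(horizon):
        x, r = step(x, policy(x))
        total += discount * r
        discount *= gamma
    return total

def greedy_labels(states, policy):
    """Best action at each probed state (training labels) and the
    relative action advantage Q(x, +1) - Q(x, -1)."""
    qs = np.array([[rollout_q(x, a, policy) for a in ACTIONS] for x in states])
    return np.array(ACTIONS)[qs.argmax(axis=1)], qs[:, 1] - qs[:, 0]

def fit_policy(states, labels):
    """Kernel classifier standing in for the RVM: sign of an RBF-weighted vote."""
    def policy(x):
        w = np.exp(-((states - x) ** 2) / 0.02)
        return 1 if w @ labels >= 0 else -1
    return policy

# Iteration 0: uniform probing of the state space under a random policy.
policy = lambda x: rng.choice(ACTIONS)
states = np.linspace(-1, 1, 21)
for _ in range(3):
    labels, adv = greedy_labels(states, policy)
    policy = fit_policy(states, labels)
    # Directed exploration: place the next batch of rollout states near the
    # current decision boundary, where the advantage is closest to zero,
    # plus a few coverage points so the rest of the space is not forgotten.
    boundary = states[np.argmin(np.abs(adv))]
    states = np.clip(boundary + 0.3 * rng.standard_normal(21), -1, 1)
    states = np.concatenate([states, np.linspace(-1, 1, 9)])

# The learned policy should push the state toward the origin.
print(policy(0.5), policy(-0.5))
```

In this sketch the advantage function is estimated pointwise from the same rollouts that produce the labels; in the paper an RVM regressor approximates it over the whole state space, and the sparsity of the relevance vectors is what keeps the per-iteration cost low as training data accumulates.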

