Applied Intelligence

On incorporating the paradigms of discretization and Bayesian estimation to create a new family of pursuit learning automata

Abstract

There are currently two fundamental paradigms that have been used to enhance the convergence speed of Learning Automata (LA). The first involves utilizing estimates of the reward probabilities, while the second involves discretizing the probability space in which the LA operates. This paper demonstrates how both of these can be utilized simultaneously, in particular by using the family of Bayesian estimates, which have been proven to have distinct advantages over their maximum likelihood counterparts. The success of LA-based estimator algorithms over the classical, Linear Reward-Inaction (L_RI)-like schemes can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive when pursuing actions based on their estimated reward probabilities. Learning should then ideally proceed in progressively larger steps, as the reward probability estimates become more accurate. This paper introduces a new estimator algorithm, the Discretized Bayesian Pursuit Algorithm (DBPA), that achieves this by incorporating both of the above paradigms. The DBPA is implemented by linearly discretizing the action probability space of the Bayesian Pursuit Algorithm (BPA) (Zhang et al. in IEA-AIE 2011, Springer, New York, pp. 608–620, 2011). The key innovation of this paper is that the linear discrete updating rules mitigate the counter-intuitive behavior of the corresponding linear continuous updating rules by augmenting them with the reward probability estimates. Extensive experimental results show the superiority of the DBPA over previous estimator algorithms. Indeed, the DBPA is probably the fastest reported LA to date. Apart from the rigorous experimental demonstration of the strength of the DBPA, the paper also briefly records the proofs of why the BPA and the DBPA are ϵ-optimal in stationary environments.