On How to Learn from a Stochastic Teacher or a Stochastic Compulsive Liar of Unknown Identity

Abstract

We consider the problem of a learning mechanism (robot or algorithm) that learns a parameter while interacting with either a stochastic teacher or a stochastic compulsive liar. The problem is modeled as follows: the learning mechanism tries to locate an unknown point on a real interval by interacting with a stochastic environment through a series of guesses. For each guess, the environment (teacher) informs the mechanism, possibly erroneously, which way it should move to reach the point. Thus, there is a non-zero probability that the feedback from the environment is erroneous. When the probability of a correct response is p > 0.5, the environment is said to be Informative, and we have the case of learning from a stochastic teacher. When this probability is p < 0.5, the environment is deemed Deceptive and is called a stochastic compulsive liar. This paper describes a novel learning strategy by which the unknown parameter can be learned in both environments. To the best of our knowledge, these are the first reported results applicable to the latter scenario. Another main contribution of this paper is that the proposed scheme is shown to operate equally well even when the learning mechanism is unaware of whether the environment is Informative or Deceptive. The learning strategy proposed herein, called CPL-ATS, partitions the search interval into three equi-sized sub-intervals, evaluates the location of the unknown point with respect to these sub-intervals using fast-converging ε-optimal L_RI learning automata, and prunes the search space in each iteration by eliminating at least one partition. The CPL-ATS algorithm is shown to converge provably to the unknown point to an arbitrary degree of accuracy, with probability as close to unity as desired. Comprehensive experimental results confirm the fast and accurate convergence of the search for a wide range of values of the environment's feedback accuracy parameter p.
The above algorithm can be used to learn parameters for non-linear optimization techniques.
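
The search described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' CPL-ATS: it is a simplified stand-in that assumes one two-action linear reward-inaction (L_RI) automaton per query point, a trisection step that keeps a single sub-interval, and an identity check at the left end of the interval (where a truthful answer must be "right"). All names (`make_environment`, `lri_direction`, `tertiary_search`) and the parameter values `theta`, `threshold`, and `tol` are hypothetical choices made for illustration.

```python
import random


def make_environment(target, p, rng):
    """Hypothetical environment: names the true direction of `target` with probability p."""
    def oracle(x):
        truth = "right" if target >= x else "left"
        lie = "left" if truth == "right" else "right"
        return truth if rng.random() < p else lie
    return oracle


def lri_direction(oracle, x, rng, theta=0.05, threshold=0.99, max_steps=100_000):
    """Two-action L_RI automaton that settles on 'left' or 'right' at query point x."""
    p_right = 0.5  # probability of choosing the action "right"
    for _ in range(max_steps):
        action = "right" if rng.random() < p_right else "left"
        if oracle(x) == action:
            # Reward: move probability mass toward the chosen action.
            if action == "right":
                p_right += theta * (1.0 - p_right)
            else:
                p_right -= theta * p_right
        # Penalty: inaction, the probabilities are left unchanged.
        if p_right >= threshold or p_right <= 1.0 - threshold:
            break
    return "right" if p_right >= 0.5 else "left"


def tertiary_search(oracle, lo=0.0, hi=1.0, tol=1e-3, seed=0):
    """Shrink [lo, hi] by thirds; returns (estimate, environment_judged_deceptive)."""
    rng = random.Random(seed)
    # Identity check (assumption of this sketch): the target lies in [lo, hi],
    # so an automaton that settles on "left" at lo signals a deceptive environment.
    deceptive = lri_direction(oracle, lo, rng) == "left"

    def decide(x):
        direction = lri_direction(oracle, x, rng)
        if deceptive:  # invert every decision when the environment lies
            direction = "left" if direction == "right" else "right"
        return direction

    while hi - lo > tol:
        third = (hi - lo) / 3.0
        m1, m2 = lo + third, hi - third
        if decide(m1) == "left":
            hi = m1            # keep the first sub-interval
        elif decide(m2) == "left":
            lo, hi = m1, m2    # keep the middle sub-interval
        else:
            lo = m2            # keep the last sub-interval
    return (lo + hi) / 2.0, deceptive


if __name__ == "__main__":
    for p in (0.8, 0.2):
        env = make_environment(0.3141, p, random.Random(1))
        print(p, tertiary_search(env, seed=2))
```

Because an L_RI automaton updates only on reward, it tends to settle on the action the environment endorses more often. In a Deceptive environment that is the wrong direction, which is why this sketch inverts every decision once the endpoint check flags the environment. The case p = 0.5 carries no directional information and is not handled.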


Bibliographic details

Source
AI 2003: Advances in Artificial Intelligence | 2003 | pp. 24-40 | 17 pages
Conference location: Perth (AU)
Authors
B. John Oommen; Govindachari Raghunath; Benjamin Kuipers
Author affiliations

Fellow of the IEEE. School of Computer Science, Carleton University, Ottawa, ON, K1S 5B6, Canada

Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords
learning automata; statistical learning; parameter estimation; stochastic optimization; pattern recognition

Similar literature

1. Parameter learning from stochastic teachers and stochastic compulsive liars [J]. Oommen B.J., Raghunath G., Kuipers B. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics. 2006, no. 4
2. Adaptive neural control for a class of stochastic nonlinear systems with unknown parameters, unknown nonlinear functions and stochastic disturbances [J]. Chen Chao-Yang, Gui Wei-Hua, Guan Zhi-Hong. Neurocomputing. 2017, issue FEBa22
3. Stabilization of Stochastically Singular Nonlinear Jump Systems with Unknown Parameters and Continuously Distributed Delays [J]. Quanxin Zhu. International Journal of Control, Automation, and Systems. 2013, no. 4
4. On How to Learn from a Stochastic Teacher or a Stochastic Compulsive Liar of Unknown Identity [C]. B. John Oommen, Govindachari Raghunath, Benjamin Kuipers. Australian Conference on Artificial Intelligence. 2003
5. Sequential Stochastic Assignment with Unknown Worker Quality [D]. Nambiar, Siddhartha. 2015
6. The 5-formyltetrahydrofolate futile cycle reduces pathway stochasticity in an extended hybrid-stochastic model of folate-mediated one-carbon metabolism [O]. Karla Misselbeck, Luca Marchetti, Corrado Priami
7. Adaptive Neural Control for a Class of Stochastic Nonlinear Systems with Unknown Parameters, Unknown Nonlinear Functions and Stochastic Disturbances [O]. Chen, Chao-Yang; Gui, Wei-Hua; Guan, Zhi-Hong. 2017
