Reward-based online learning in non-stationary environments: Adapting a P300-speller with a “backspace” key

Abstract

We adapt a policy gradient approach to the problem of reward-based online learning of a non-invasive EEG-based “P300”-speller. We first clarify the nature of the P300-speller classification problem and present a general regularized gradient ascent formula. We then show that when the reward is immediate and binary (namely “bad response” or “good response”), each update is expected to improve the classifier accuracy, whether the actual response is correct or not. We also estimate the robustness of the method to occasional mistaken rewards, i.e., we show that the learning efficacy may decrease only linearly with the rate of invalid rewards. The effectiveness of our approach is tested in a series of simulations reproducing the conditions of real experiments. In a first experiment, we show that a systematic improvement of the spelling rate is obtained for all subjects in the absence of initial calibration. In a second experiment, we consider the online recovery that is expected to follow electrode failure. Combined with a specific failure detection algorithm, the spelling error information (typically contained in a “backspace” hit) is shown to be useful for the policy gradient to adapt the P300 classifier to the new situation, provided the feedback is reliable enough (namely, having a reliability greater than 70%).

Bibliographic details

Source
International Joint Conference on Neural Networks | 2015 | pp. 1-8 | 8 pages
Authors
Dauce Emmanuel; Proix Timothee; Ralaivola Liva;
Original format: PDF
Keywords
Brain-Computer Interfaces; Online learning; P300 speller; Policy gradient; Reinforcement learning;

Similar literature

1. Online Ensemble Multi-kernel Learning Adaptive to Non-stationary and Adversarial Environments [J]. Yanning Shen, Tianyi Chen, Georgios Giannakis. JMLR: Workshop and Conference Proceedings. 2018, No. 4
2. A heterogeneous online learning ensemble for non-stationary environments [J]. Knowledge-Based Systems. 2020, No. Jana5
3. Adaptive and on-line learning in non-stationary environments [J]. Edwin Lughofer, Moamar Sayed-Mouchaweh. Evolving Systems. 2015, No. 2
4. Reward-based online learning in non-stationary environments: Adapting a P300-speller with a "Backspace" key [C]. Emmanuel Daucé, Timothée Proix, Liva Ralaivola. International Joint Conference on Neural Networks. 2015
5. Adaptive Guidance for Online Learning Environments [D]. Bassen, Jonathan. 2020
6. An Adaptive Heterogeneous Online Learning Ensemble Classifier for Nonstationary Environments [O]. Tinofirei Museba, Fulufhelo Nelwamondo, Khmaies Ouahada. 2021
7. Adaptive and on-line learning in non-stationary environments [O]. Edwin Lughofer, Moamar Sayed-Mouchaweh. 2015
