2013 Eighth International Conference and Exhibition on Ecological Vehicles and Renewable Energies

Modeling of Ambient Comfort Affect Reward based on multi-agents in cloud interconnection environment for developing the sustainable home controller

Abstract

The paper presents research based on a vision of a multi-agent model for an ambient comfort measurement and environment control system. This approach is used to develop the Smarter Eco-Social Laboratory (SrESL). The human Ambient Comfort Affect Reward (ACAR) index is proposed for developing the Reinforcement Learning Based Ambient Comfort Controller (RL-ACC), for experiments using SrESL equipment. The ACAR index is treated as dependent on human physiological parameters, such as temperature, the electrocardiogram (ECG) and electro-dermal activity (EDA). Fuzzy logic is used to approximate the ACAR index function by defining two fuzzy inference systems: the Arousal-Valence System and the Ambient Comfort Affect Reward (ACAR) System. The goal of the RL-ACC is to find the environmental state characteristics that create optimal comfort for the people affected by that environment. A Radial Basis Neural Network is used as the main component of the RL-ACC and performs two roles: the policy structure, known as the Actor, which selects actions, and the estimated value function, known as the Critic, which criticizes the actions made by the Actor. The Actor, which manages the Critic's processes, is used as a value-function approximation for the continuous learning tasks of the RL-ACC and is presented in this paper.
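
The Actor and Critic roles described in the abstract can be illustrated with a small sketch in which both share one set of radial basis features. This is not the paper's RL-ACC: the one-dimensional temperature state, the `toy_reward` function (a hypothetical stand-in for the ACAR index, which the paper derives from fuzzy inference over physiological signals), the 22 degree target, and all learning rates are assumptions chosen only to make the example runnable.

```python
import math
import random


def rbf_features(x, centers, width):
    """Gaussian radial basis activations for a scalar state x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def dot(weights, features):
    return sum(w * f for w, f in zip(weights, features))


def toy_reward(temp, target=22.0):
    """Hypothetical stand-in for the ACAR index: highest at `target` (assumed)."""
    return -abs(temp - target)


def train(episodes=200, steps=30, seed=0):
    rng = random.Random(seed)
    centers = [16.0 + 2.0 * i for i in range(7)]  # RBF centers at 16, 18, ..., 28
    width = 2.0
    actor_w = [0.0] * len(centers)   # Actor: mean of a Gaussian policy
    critic_w = [0.0] * len(centers)  # Critic: estimated state value
    alpha_actor, alpha_critic, gamma, sigma = 0.01, 0.1, 0.9, 0.5

    for _ in range(episodes):
        temp = rng.uniform(16.0, 28.0)
        for _ in range(steps):
            phi = rbf_features(temp, centers, width)
            mean = dot(actor_w, phi)
            action = rng.gauss(mean, sigma)  # temperature change chosen by the Actor
            next_temp = min(28.0, max(16.0, temp + action))
            reward = toy_reward(next_temp)

            # Critic: the TD error "criticizes" the action just taken.
            phi_next = rbf_features(next_temp, centers, width)
            td_error = reward + gamma * dot(critic_w, phi_next) - dot(critic_w, phi)

            # Gradient of log N(action; mean, sigma) w.r.t. actor weights is
            # (action - mean) / sigma^2 * phi.
            score = (action - mean) / (sigma ** 2)
            for i, p in enumerate(phi):
                critic_w[i] += alpha_critic * td_error * p
                actor_w[i] += alpha_actor * td_error * score * p
            temp = next_temp
    return centers, width, actor_w, critic_w


if __name__ == "__main__":
    centers, width, actor_w, critic_w = train()
    for t in (18.0, 22.0, 26.0):
        phi = rbf_features(t, centers, width)
        print(t, round(dot(actor_w, phi), 3), round(dot(critic_w, phi), 3))
```

Sharing the feature vector between the two weight sets mirrors the abstract's statement that one radial basis network serves both roles; a real controller would replace `toy_reward` with the measured ACAR index and use a multi-dimensional environmental state.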
