IEEE International Conference on Autonomic Computing and Self-Organizing Systems

Reconfigurable Embedded Devices Using Reinforcement Learning to Develop Action-Policies

Abstract

The size of sensor networks supporting smart cities is ever increasing. Sensor network resiliency becomes vital for critical networks such as emergency response and wastewater treatment. One approach is to engineer ‘self-aware’ sensors that can proactively change their component composition in response to changes in workload when critical devices fail. By extension, these devices could anticipate their own termination, such as battery depletion, and offload current tasks onto connected devices. These neighboring devices can then reconfigure themselves to process these tasks, thus avoiding catastrophic network failure. In this article, we present an array of self-aware sensors that use Q-learning to develop a policy that guides device reaction to various environmental stimuli. The novelty lies in the use of field-programmable gate arrays embedded on the sensors that take into account internal system state, configuration, and learned state-action pairs that guide device decisions in order to meet system demands. Experiments show that even relatively simple reward functions develop Q-learning policies that yield positive device behaviors in dynamic environments.
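
The abstract does not specify the state space, action set, or reward function, so the following is only a minimal tabular Q-learning sketch under assumed values, not the authors' implementation. The state names, the two hardware configurations, the reward of +1/-1, and the hyperparameters are all hypothetical and chosen purely to illustrate how a simple reward function can produce a sensible reconfiguration policy.

```python
import random

# Hypothetical toy model (not from the paper): states are workload levels,
# actions are hardware configurations the device can switch to.
STATES = ["low_load", "high_load"]
ACTIONS = ["small_config", "large_config"]


def reward(state, action):
    """Assumed reward: +1 if the configuration matches the workload, else -1."""
    matched = (state == "low_load") == (action == "small_config")
    return 1.0 if matched else -1.0


def train(steps=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        r = reward(state, action)
        # Assumption: the workload changes at random, independent of the action.
        next_state = rng.choice(STATES)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def policy(q):
    """Extract the greedy action for each state from the learned Q-table."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    print(policy(train()))
```

In this toy setting the learned greedy policy selects the configuration that matches the current workload. A real device would replace the random workload transition and the hand-written reward with measurements of its own system state.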
