
ONLINE LEARNING METHOD AND VEHICLE CONTROL METHOD BASED ON REINFORCEMENT LEARNING WITHOUT ACTIVE SEARCH



Abstract

PROBLEM TO BE SOLVED: To provide a computer-implemented method for adaptively controlling the autonomous operation of a vehicle. SOLUTION: In a computer processing system configured to autonomously control a vehicle, a critic network uses samples of passively collected data and a state cost to determine an estimated average cost and an approximated arrival cost (cost-to-go) function that, when applied by an actor network, produces a minimum value of the vehicle's arrival cost. An actor network operatively coupled to the critic network then determines a control input that, when applied to the vehicle, produces a minimum value of the arrival cost. The actor network determines the control input by estimating a noise level, using the average cost, the arrival cost determined from the approximated arrival cost function, the control dynamics for the current state of the vehicle, and the passively collected data. SELECTED DRAWING: Figure 3
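
The abstract gives no equations, so the following is only a rough illustration of the critic/actor split it describes, not the patented method. It assumes a small discrete-state, linearly solvable MDP in which the control cost is a KL divergence from the passive dynamics; that assumption, the toy ring-shaped state space, and all function names (estimate_passive_transitions, critic, actor) are hypothetical. The noise-level estimation and the use of control dynamics mentioned in the abstract are omitted here.

```python
import math
import random

def estimate_passive_transitions(samples, n_states):
    """Row-normalised transition counts from passively collected (x, x_next) pairs."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for x, x_next in samples:
        counts[x][x_next] += 1.0
    probs = []
    for row in counts:
        total = sum(row)
        # Unvisited states fall back to a uniform row (an assumption of this sketch).
        probs.append([c / total for c in row] if total > 0
                     else [1.0 / n_states] * n_states)
    return probs

def critic(p, state_cost, iters=500):
    """Power iteration on diag(exp(-q)) P.

    Returns (estimated average cost, cost-to-go per state). The cost-to-go is
    defined up to a constant and is shifted so that its minimum is zero.
    """
    n = len(p)
    z = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        z_new = [math.exp(-state_cost[i]) * sum(p[i][j] * z[j] for j in range(n))
                 for i in range(n)]
        lam = max(z_new)              # converges to the principal eigenvalue
        z = [v / lam for v in z_new]  # keep max(z) == 1
    return -math.log(lam), [-math.log(v) for v in z]

def actor(p, cost_to_go, x):
    """Controlled next-state distribution: passive dynamics reweighted by exp(-V)."""
    w = [p[x][j] * math.exp(-cost_to_go[j]) for j in range(len(cost_to_go))]
    total = sum(w)
    return [v / total for v in w]

# Toy data: a lazy random walk on a ring of 5 states, observed passively.
random.seed(0)
n = 5
state_cost = [0.0, 1.0, 2.0, 1.0, 0.5]
x, samples = 0, []
for _ in range(5000):
    x_next = (x + random.choice([-1, 0, 1])) % n
    samples.append((x, x_next))
    x = x_next

p_hat = estimate_passive_transitions(samples, n)
avg_cost, cost_to_go = critic(p_hat, state_cost)
policy_at_1 = actor(p_hat, cost_to_go, 1)
print("estimated average cost:", avg_cost)
print("controlled distribution from state 1:", policy_at_1)
```

In this toy setting the critic never issues a control of its own: it only sees transitions generated by the passive dynamics, and the actor then shifts probability from state 1 toward the cheaper neighbouring state 0 and away from the costly state 2.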


Bibliographic data

Publication number: JP2018037064A

Patent type:
Publication date: 2018-03-08

Original format: PDF
Applicant/Patentee: TOYOTA MOTOR ENGINEERING & MANUFACTURING NORTH AMERICA INC

Application number: JP20170131700
Inventor: NISHI TOMOKI

Filing date: 2017-07-05
Classification: G05B13/02; G06N99
Country: JP
Database entry time: 2022-08-21 13:09:37
