Online learning and vehicle control method based on reinforcement learning without active exploration

Abstract

A computer-implemented method of adaptively controlling an autonomous operation of a vehicle is provided. The method includes steps of (a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and (b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle that produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the average cost, a cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the passively collected data.
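The actor step in (b) can be illustrated with a minimal sketch. It is not the patent's implementation. The abstract gives no formulas, so everything below is an assumption: a scalar state, a control cost of the form u²/(2ρ) where ρ stands in for the "noise level", and a discretized average-cost Hamilton-Jacobi-Bellman relation, (ρ/2)(B·V′)²·dt ≈ (q − V_avg)·dt + V(x′) − V(x), evaluated on passively collected transitions (x, x′). All function and variable names are hypothetical, and the critic outputs (the cost-to-go V and the average cost) are supplied by hand here rather than learned.

```python
import math
from typing import Callable, Sequence, Tuple

Fn = Callable[[float], float]


def estimate_noise_level(samples: Sequence[Tuple[float, float]], state_cost: Fn,
                         avg_cost: float, cost_to_go: Fn, cost_to_go_grad: Fn,
                         control_gain: Fn, dt: float) -> float:
    """Least-squares fit of rho in y_k = rho * g_k over passive samples (assumed model)."""
    num = 0.0
    den = 0.0
    for x, x_next in samples:
        # Residual of the critic's estimates along one passive transition.
        y = (state_cost(x) - avg_cost) * dt + cost_to_go(x_next) - cost_to_go(x)
        # Regressor built from the control dynamics and the cost-to-go gradient.
        g = 0.5 * dt * (control_gain(x) * cost_to_go_grad(x)) ** 2
        num += y * g
        den += g * g
    if den == 0.0:
        raise ValueError("samples carry no information about the noise level")
    return num / den


def control_input(x: float, rho: float, cost_to_go_grad: Fn, control_gain: Fn) -> float:
    """Control that minimizes the assumed quadratic-control-cost HJB right-hand side."""
    return -rho * control_gain(x) * cost_to_go_grad(x)


# Hypothetical toy problem: passive dynamics x' = x + A*x*DT, B = 1, q(x) = x^2.
A, DT = -1.0, 1e-3
P = (math.sqrt(3.0) - 1.0) / 2.0  # V(x) = P*x^2 solves the noise-free HJB for rho = 1
samples = [(x, x + A * x * DT) for x in (0.5, 1.0, 2.0)]
rho = estimate_noise_level(samples, lambda x: x * x, 0.0,
                           lambda x: P * x * x, lambda x: 2.0 * P * x,
                           lambda x: 1.0, DT)
u = control_input(2.0, rho, lambda x: 2.0 * P * x, lambda x: 1.0)
```

In this toy setup the fitted ρ comes out close to 1, the value used to construct V, and the resulting control at x = 2 is negative, pushing the state back toward the low-cost region. The point of the sketch is only the data flow named in the abstract: the actor consumes the average cost, the cost-to-go, the control dynamics, and passive samples, and needs no exploratory actions.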

Bibliographic data

Publication number: US10065654B2

Patent type:
Publication date: 2018-09-04

Original format: PDF
Applicant/assignee: TOYOTA MOTOR ENGINEERING & MANUFACTURING NORTH AMERICA INC.

Application number: US201615205558
Inventor: TOMOKI NISHI

Filing date: 2016-07-08
Classification: G05D1; B60W50/06; G05B13/04; G05B13/02; B60W50
Country: US
Database entry time: 2022-08-21 13:02:23
