A learning method and a learning device for performing customized route planning by supporting reinforcement learning by using human travel data as training data

Abstract

PROBLEM TO BE SOLVED: To provide a learning method and a learning device for optimizing route planning for each passenger, together with a testing method and a testing device that use them. The learning device performs a process S01-1 in which an adjustment reward network generates first adjustment rewards by referring to information on the actual situation vectors and actual actions included in a driving trajectory; a process in which a common reward module generates first common rewards by referring to the same information on the actual situation vectors and actual actions; and a process S01-2 in which a prediction network generates actual expected values by referring to the actual situation vectors. The learning device then learns the parameters of the adjustment reward network through a step S02, in which a first loss layer generates an adjustment reward loss and performs backpropagation with it. Step S03 is then performed. [Selected drawing] Fig. 3
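The data flow of S01-1, S01-2, and S02 can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the linear adjustment reward network, the action-penalty common reward, the linear prediction network, the discount factor, and above all the loss. The abstract does not state the loss formula, so a squared error between the discounted sum of (common + adjustment) rewards and the predicted expected value is used as a placeholder. Only the adjustment reward network's parameters `w` are updated, as the abstract describes.

```python
import numpy as np

GAMMA = 0.9  # discount factor (assumed, not from the patent)

def adjustment_rewards(w, states, actions):
    """Hypothetical linear adjustment reward network: one reward per time step."""
    feats = np.concatenate([states, actions], axis=1)
    return feats @ w, feats

def common_rewards(states, actions):
    """Hypothetical common reward module: penalizes large actions for every driver."""
    return -np.sum(actions ** 2, axis=1)

def predicted_values(u, states):
    """Hypothetical linear prediction network giving an expected value per state."""
    return states @ u

def discount_matrix(T, gamma=GAMMA):
    """D[t, k] = gamma**(k - t) for k >= t, else 0, so D @ r gives discounted returns."""
    idx = np.arange(T)
    diff = idx[None, :] - idx[:, None]
    return np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)

def adjustment_reward_loss(w, u, states, actions):
    """Placeholder loss (assumed): mean squared gap between returns and predicted values.

    Returns the loss and its gradient with respect to w only.
    """
    r_adj, feats = adjustment_rewards(w, states, actions)
    r_com = common_rewards(states, actions)
    D = discount_matrix(len(states))
    returns = D @ (r_adj + r_com)
    err = returns - predicted_values(u, states)
    loss = np.mean(err ** 2)
    grad_w = (2.0 / len(states)) * feats.T @ (D.T @ err)
    return loss, grad_w

def train(states, actions, u, steps=300, lr=0.005):
    """Gradient descent on the adjustment reward network parameters only."""
    w = np.zeros(states.shape[1] + actions.shape[1])
    losses = []
    for _ in range(steps):
        loss, grad = adjustment_reward_loss(w, u, states, actions)
        losses.append(loss)
        w -= lr * grad
    return w, losses

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    states = rng.normal(size=(20, 3))   # synthetic situation vectors
    actions = rng.normal(size=(20, 2))  # synthetic actions
    u = rng.normal(size=3)              # fixed prediction-network weights
    w, losses = train(states, actions, u)
    print("first loss:", losses[0], "last loss:", losses[-1])
```

The trajectory in the usage block is synthetic random data, not human driving data. A real system would replace the linear maps with neural networks and the hand-written gradient with automatic differentiation.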

Bibliographic data

Publication number: JP2020126646A
Publication date: 2020-08-20
Original format: PDF
Applicant/Assignee: 株式会社ストラドビジョン
Application number: JP20200011163
Inventors: 金桂賢; 金鎔重; 金鶴京; 南雲鉉; 夫碩▲ふん▼; 成明哲; 申東洙; 呂東勳; 柳宇宙; 李明春; 李炯樹; 張泰雄; 鄭景中; 諸泓模; 趙浩辰
Filing date: 2020-01-27
Classification: G08G1/16; G08G1; G01C21/34; G06N20
Country: JP
Date added to database: 2022-08-21 11:37:47
