A learning method and a learning device for performing customized route planning by supporting reinforcement learning by using human travel data as training data
Abstract
PROBLEM TO BE SOLVED: To provide a learning method and a learning device for optimizing route planning for each passenger, and a testing method and a testing device using the same. A learning device performs a process S01-1 of causing an adjustment reward network to generate first adjustment rewards by referring to information on the actual situation vectors and actual actions included in a travel trajectory, a process of causing a common reward module to generate first common rewards by referring to the same information on the actual situation vectors and actual actions, and a process S01-2 of causing a prediction network to generate actual expected values by referring to the actual situation vectors. The learning device then performs a step S02 of causing a first loss layer to generate an adjustment reward loss and to perform backpropagation with it, thereby learning the parameters of the adjustment reward network. A step S03 is then performed. [Selected drawing] Fig. 3
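The data flow described in the abstract can be illustrated with a small sketch. The abstract gives neither the network architectures nor the formula for the adjustment reward loss, so everything concrete here is an assumption made for illustration: the linear stand-ins for the adjustment reward network and the prediction network, the `common_reward` rule, the discount factor `GAMMA`, the squared-error loss, the hand-written `TRAJECTORY`, and all function names. Only the adjustment reward weights are updated, and the prediction network is held fixed.

```python
GAMMA = 0.9           # assumed discount factor (not given in the abstract)
LEARNING_RATE = 0.01  # assumed step size for gradient descent

# Hypothetical travel trajectory: (situation vector, action) pairs.
TRAJECTORY = [
    ((0.0, 1.0), 0.5),
    ((0.2, 0.8), -0.3),
    ((0.5, 0.5), 0.1),
    ((0.9, 0.1), 0.0),
]

# Fixed weights of a stand-in linear prediction network (assumed given).
VALUE_WEIGHTS = [0.5, -0.2, 0.1]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def common_reward(state, action):
    # Hypothetical common reward shared by all passengers: penalize large actions.
    return -abs(action)


def reward_features(state, action):
    # Input of the stand-in adjustment reward network: situation vector, action, bias.
    return list(state) + [action, 1.0]


def predicted_value(state):
    # Stand-in prediction network: expected value from the situation vector alone.
    return dot(VALUE_WEIGHTS, list(state) + [1.0])


def loss_and_grad(w):
    """Assumed loss: mean squared gap between the expected value at each step
    and the discounted sum of (common + adjustment) rewards from that step on.
    Returns the loss and its gradient with respect to the adjustment weights w."""
    n = len(TRAJECTORY)
    ret, ret_grad = 0.0, [0.0] * len(w)
    loss, grad = 0.0, [0.0] * len(w)
    for state, action in reversed(TRAJECTORY):
        feats = reward_features(state, action)
        reward = common_reward(state, action) + dot(w, feats)
        ret = reward + GAMMA * ret                       # discounted return
        ret_grad = [f + GAMMA * g for f, g in zip(feats, ret_grad)]
        gap = predicted_value(state) - ret
        loss += gap * gap / n
        # d(gap^2)/dw = 2 * gap * (-d ret/dw): the backpropagated gradient.
        grad = [g - 2.0 * gap * rg / n for g, rg in zip(grad, ret_grad)]
    return loss, grad


def train(w, steps=200):
    # Plain gradient descent on the adjustment reward weights only.
    for _ in range(steps):
        _, grad = loss_and_grad(w)
        w = [wi - LEARNING_RATE * gi for wi, gi in zip(w, grad)]
    return w


initial_weights = [0.0] * 4
initial_loss, _ = loss_and_grad(initial_weights)
learned_weights = train(initial_weights)
final_loss, _ = loss_and_grad(learned_weights)
print("loss before:", initial_loss, "loss after:", final_loss)
```

The gradient is written out by hand to keep the sketch free of dependencies. A practical version would more likely use an automatic differentiation library and nonlinear networks.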