UNSUPERVISED DETECTION OF INTERMEDIATE REINFORCEMENT LEARNING GOALS

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting intermediate reinforcement learning goals. One of the methods includes obtaining a plurality of demonstration sequences, each of the demonstration sequences being a sequence of images of an environment while a respective instance of a reinforcement learning task is being performed; for each demonstration sequence, processing each image in the demonstration sequence through an image processing neural network to determine feature values for a respective set of features for the image; determining, from the demonstration sequences, a partitioning of the reinforcement learning task into a plurality of subtasks, wherein each image in each demonstration sequence is assigned to a respective subtask of the plurality of subtasks; and determining, from the feature values for the images in the demonstration sequences, a respective set of discriminative features for each of the plurality of subtasks.
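
The last two steps of the abstract (partitioning each demonstration into subtasks, then picking discriminative features per subtask) can be illustrated with a minimal sketch. The abstract does not say how either step is carried out, so everything here is an assumption made for illustration: feature vectors are taken as already computed by the image processing neural network, frames are assigned to subtasks by a uniform split in time, and a feature counts as discriminative for a subtask when its mean inside the subtask is high relative to its mean elsewhere. The names `partition_uniform` and `discriminative_features` are hypothetical.

```python
from statistics import mean, pstdev


def partition_uniform(seq_len, num_subtasks):
    """Assign each frame index to a subtask by an equal-length split in time.

    Assumed rule for illustration; the abstract does not specify one.
    """
    return [i * num_subtasks // seq_len for i in range(seq_len)]


def discriminative_features(demos, num_subtasks, top_k):
    """Return, for each subtask, the indices of its top_k features.

    demos: list of demonstration sequences; each sequence is a list of
    per-image feature vectors (assumed precomputed by the network).
    Assumes every subtask receives at least one frame.
    """
    num_features = len(demos[0][0])

    # Pool the feature vectors of all demonstrations by assigned subtask.
    pooled = [[] for _ in range(num_subtasks)]
    for seq in demos:
        labels = partition_uniform(len(seq), num_subtasks)
        for vec, label in zip(seq, labels):
            pooled[label].append(vec)

    selected = []
    for k in range(num_subtasks):
        inside = pooled[k]
        outside = [v for j in range(num_subtasks) if j != k for v in pooled[j]]
        scores = []
        for f in range(num_features):
            a = [v[f] for v in inside]
            b = [v[f] for v in outside]
            # Assumed score: signed mean gap, scaled by the combined spread.
            spread = pstdev(a) + pstdev(b) + 1e-8
            scores.append((mean(a) - mean(b)) / spread)
        ranked = sorted(range(num_features), key=lambda f: scores[f], reverse=True)
        selected.append(ranked[:top_k])
    return selected
```

Called on two short demonstrations in which feature 0 is active early and feature 1 is active late, `discriminative_features(demos, num_subtasks=2, top_k=1)` selects one feature index per subtask. A real system would likely replace the uniform split with a learned or optimized segmentation.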

Bibliographic data

Publication number: EP3535702A1

Publication date: 2019-09-11

Original format: PDF
Applicant/Patentee: GOOGLE LLC

Application number: EP20170801215
Inventor: SERMANET PIERRE

Filing date: 2017-11-06
Classification: G06N3/04; G06N3; G06N3/08
Country: EP
Database entry time: 2022-08-21 12:29:56
