
MULTILINGUAL IMAGE QUESTION ANSWERING



Abstract

Embodiments of a multimodal question answering (mQA) model are presented for answering questions about the content of an image. In embodiments, the model comprises four components: a Long Short-Term Memory (LSTM) component for extracting the question representation, a convolutional neural network (CNN) component for extracting the visual representation, an LSTM component for storing the linguistic context of the answer, and a fusing component for combining the information from the first three components and generating the answer. A Freestyle Multilingual Image Question Answering (FM-IQA) dataset was constructed to train and evaluate embodiments of the mQA model. The quality of the answers that the mQA model generates on this dataset is evaluated by human judges through a Turing test.
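The four-component arrangement described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names `LSTM` and `MQASketch`, all dimensions, the random untrained weights, and the sum-then-tanh fusion are assumptions made for illustration. The CNN component is not implemented here; `image_features` is assumed to be a vector already produced by some CNN.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class LSTM:
    """Single-layer LSTM without biases; gates read [input; previous hidden]."""
    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.W = {k: rand_matrix(hid_dim, in_dim + hid_dim, rng) for k in "ifog"}

    def run(self, inputs):
        """Return the final hidden state after reading a list of input vectors."""
        h = [0.0] * self.hid_dim
        c = [0.0] * self.hid_dim
        for x in inputs:
            xh = x + h  # list concatenation
            i = [sigmoid(z) for z in matvec(self.W["i"], xh)]
            f = [sigmoid(z) for z in matvec(self.W["f"], xh)]
            o = [sigmoid(z) for z in matvec(self.W["o"], xh)]
            g = [math.tanh(z) for z in matvec(self.W["g"], xh)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h

class MQASketch:
    """Hypothetical four-component layout; weights are random, not trained."""
    def __init__(self, vocab, img_dim=8, emb_dim=6, hid_dim=5, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        self.emb = {w: [rng.uniform(-0.1, 0.1) for _ in range(emb_dim)] for w in vocab}
        self.question_lstm = LSTM(emb_dim, hid_dim, rng)  # component 1: question
        # Component 2 (CNN) is assumed to run elsewhere and supply image_features.
        self.answer_lstm = LSTM(emb_dim, hid_dim, rng)    # component 3: answer context
        # Component 4: fusion projects each source into a shared space.
        self.W_q = rand_matrix(hid_dim, hid_dim, rng)
        self.W_img = rand_matrix(hid_dim, img_dim, rng)
        self.W_a = rand_matrix(hid_dim, hid_dim, rng)
        self.W_out = rand_matrix(len(vocab), hid_dim, rng)

    def next_word_probs(self, question, image_features, answer_so_far):
        """Distribution over the next answer word given all three sources."""
        q = self.question_lstm.run([self.emb[w] for w in question])
        a = self.answer_lstm.run([self.emb[w] for w in answer_so_far])
        parts = [matvec(self.W_q, q),
                 matvec(self.W_img, image_features),
                 matvec(self.W_a, a)]
        fused = [math.tanh(sum(vals)) for vals in zip(*parts)]  # assumed fusion rule
        logits = matvec(self.W_out, fused)
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]  # numerically stable softmax
        total = sum(exps)
        return {w: e / total for w, e in zip(self.vocab, exps)}

# Usage with a toy vocabulary and a placeholder image feature vector.
vocab = ["what", "color", "is", "the", "bus", "red", "<EOA>"]
model = MQASketch(vocab)
probs = model.next_word_probs(["what", "color", "is", "the", "bus"], [0.5] * 8, ["red"])
```

Because the weights are random, the resulting distribution is close to uniform; the sketch only shows how the question, image, and partial-answer representations meet in the fusing step before a word is chosen.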


Bibliographic details

Publication number: KR101982220B1

Patent type:
Publication date: 2019-08-28

Original format: PDF
Applicant/Patentee: 바이두 유에스에이 엘엘씨 (Baidu USA LLC)

Application number: KR20177006950
Inventors: 가오 하오위안; 마오 준화; 저우 제; 황 즈헝; 왕 레이; 쉬 웨이

Filing date: 2016-05-19
Classification: G06F17/28; G06F16; G06N3/04
Country: KR
Database entry time: 2022-08-21 11:48:29
