MULTILINGUAL IMAGE QUESTION ANSWERING

Abstract

Embodiments of a multimodal question answering (mQA) model for answering questions about the content of an image are presented. In embodiments, the model comprises four components: a long short-term memory (LSTM) component for extracting a representation of the question, a convolutional neural network (CNN) component for extracting a visual representation of the image, a second LSTM component for storing the linguistic context of the answer, and a fusing component that combines the information from the first three components and generates the answer. A Freestyle Multilingual Image Question Answering (FM-IQA) dataset was constructed to train and evaluate embodiments of the mQA model. The quality of the answers the mQA model generates on this dataset is evaluated by human judges through a Turing test.
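The four-component data flow described in the abstract can be sketched as below. This is an illustrative sketch only, not the patented implementation: every name, dimension, and the toy vocabulary is hypothetical, the weights are random and untrained, the CNN is replaced by a plain linear projection of a short precomputed feature vector, and the fusion rule (sum of linear projections, then tanh, then softmax) is an assumption, since the abstract does not specify one.

```python
import math
import random

random.seed(0)
DIM = 4  # hypothetical hidden size, chosen only for illustration
VOCAB = ["<eos>", "a", "cat", "dog", "what", "is", "this"]  # toy vocabulary (assumption)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Minimal LSTM cell: input, forget, and output gates plus a candidate state."""

    def __init__(self, in_dim, hid_dim):
        self.hid_dim = hid_dim
        # One weight matrix per gate, applied to [input; previous hidden].
        # Biases are omitted for brevity.
        self.w = {g: rand_matrix(hid_dim, in_dim + hid_dim) for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(v) for v in matvec(self.w["i"], z)]
        f = [sigmoid(v) for v in matvec(self.w["f"], z)]
        o = [sigmoid(v) for v in matvec(self.w["o"], z)]
        g = [math.tanh(v) for v in matvec(self.w["g"], z)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def encode(self, vectors):
        """Run the cell over a sequence and return the final hidden state."""
        h, c = [0.0] * self.hid_dim, [0.0] * self.hid_dim
        for x in vectors:
            h, c = self.step(x, h, c)
        return h


# Untrained, randomly initialised parameters (illustration only).
embed = {w: [random.uniform(-0.5, 0.5) for _ in range(DIM)] for w in VOCAB}
question_lstm = LSTMCell(DIM, DIM)   # component 1: question representation
image_proj = rand_matrix(DIM, 6)     # component 2: linear stand-in for the CNN
answer_lstm = LSTMCell(DIM, DIM)     # component 3: linguistic context of the answer
fuse_q, fuse_i, fuse_a = (rand_matrix(DIM, DIM) for _ in range(3))
out_proj = rand_matrix(len(VOCAB), DIM)


def next_word_distribution(question, image_features, answer_so_far):
    """Component 4: fuse the three representations and score the next answer word."""
    q = question_lstm.encode([embed[w] for w in question])
    v = matvec(image_proj, image_features)
    a = answer_lstm.encode([embed[w] for w in answer_so_far])
    fused = [
        math.tanh(x + y + z)
        for x, y, z in zip(matvec(fuse_q, q), matvec(fuse_i, v), matvec(fuse_a, a))
    ]
    return softmax(matvec(out_proj, fused))


probs = next_word_distribution(
    ["what", "is", "this"], [0.1, 0.9, 0.3, 0.0, 0.5, 0.2], ["a"]
)
```

Calling `next_word_distribution` repeatedly, appending the chosen word to `answer_so_far` until `<eos>` is selected, would produce a full answer one word at a time. With untrained weights the output is meaningless; the sketch only shows how the three representations meet in the fusing component.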
