Conference: International Conference for Emerging Technology

Visual Question Answering Models Evaluation

Abstract

Visual question answering (VQA), visual dialog, and visual chatbots are multidisciplinary research problems that combine natural language processing (NLP), image feature extraction, and knowledge reasoning (KR). Unlike image captioning, a more basic computer vision task, VQA adds interactivity: users can ask domain-specific as well as open-ended questions about an image and receive insights based on its features or characteristics. Our research evaluates the performance of VQA on counting problems. Given an image, a VQA model is expected to answer "how many" questions. We used a few pre-trained models for VQA and visual dialog and tabulated the accuracy of their predicted answers against the pre-defined ground truth.
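
The accuracy comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the authors' evaluation code: the function names `normalize_count` and `counting_accuracy`, the number-word table, and the exact-match criterion are all assumptions, since the abstract does not say how predicted answers were matched against the ground truth.

```python
from typing import Optional, Sequence

# Hypothetical lookup table so that "two" and "2" compare as equal.
_NUMBER_WORDS = {
    "zero": 0, "no": 0, "none": 0, "one": 1, "two": 2, "three": 3,
    "four": 4, "five": 5, "six": 6, "seven": 7, "eight": 8,
    "nine": 9, "ten": 10,
}


def normalize_count(answer: str) -> Optional[int]:
    """Map a free-text answer to an integer count, or None if unparseable."""
    token = answer.strip().lower()
    if token.isdigit():
        return int(token)
    return _NUMBER_WORDS.get(token)


def counting_accuracy(predictions: Sequence[str],
                      ground_truths: Sequence[str]) -> float:
    """Fraction of "how many" questions whose predicted count matches exactly."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not predictions:
        return 0.0
    correct = 0
    for predicted, expected in zip(predictions, ground_truths):
        predicted_count = normalize_count(predicted)
        # An unparseable prediction (e.g. "many") is counted as wrong.
        if predicted_count is not None and predicted_count == normalize_count(expected):
            correct += 1
    return correct / len(predictions)


if __name__ == "__main__":
    # Made-up answers for illustration only; not results from the paper.
    model_answers = ["2", "three", "many"]
    reference_answers = ["two", "3", "4"]
    print(f"accuracy = {counting_accuracy(model_answers, reference_answers):.3f}")
```

Exact match is the strictest choice here. A real evaluation might instead use a consensus-based score over several human answers, which this sketch does not attempt.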
