Journal: Information Sciences: An International Journal

RepeatPadding: Balancing words and sentence length for language comprehension in visual question answering



Abstract

Visual question answering (VQA) is a complicated Turing-AI task that requires not only understanding multi-modal inputs but also reasoning over them to produce the correct answer. Popular recent works rely on complicated and sophisticated reasoning modules. However, the language representation, which is frequently treated as the guide for VQA, has not been fully explored in current research, leading to insufficient reasoning and unsatisfactory answers. In this work, two language-processing methods, VieAns and RepeatPadding, are proposed to balance the sentence by cropping and padding the question. The language information is thereby transformed into different expressions, which pushes the language model to capture more representative features and further boosts the accuracy of the predicted answers. Experiments on the benchmark COCO-QA and VQA2.0 datasets demonstrate the effectiveness of the proposed methods. In particular, the proposed RepeatPadding method is more suitable for different language models. (C) 2020 Elsevier Inc. All rights reserved.
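The abstract does not spell out how the question is cropped and padded. As a rough illustration only, one plausible reading of the name RepeatPadding is that a short question is extended to a fixed length by repeating its own tokens rather than by appending blank padding tokens, while an overlong question is cropped. The function name `repeat_pad`, its parameters, and the cycling strategy below are assumptions made for this sketch, not the authors' published procedure.

```python
from typing import List


def repeat_pad(tokens: List[str], target_len: int) -> List[str]:
    """Pad or crop a tokenized question to exactly target_len tokens.

    Assumed behavior (not taken from the paper): short questions are
    extended by cycling through their own tokens; long ones are cropped.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not tokens:
        raise ValueError("tokens must be non-empty")
    if len(tokens) >= target_len:
        # Crop questions that are already long enough.
        return tokens[:target_len]
    # Repeat the question from its start until the target length is reached.
    return [tokens[i % len(tokens)] for i in range(target_len)]


question = "what color is the cat".split()
padded = repeat_pad(question, 8)
cropped = repeat_pad(question, 3)
```

Under this assumption every position in the fixed-length input carries a real word, so the language model never encounters a run of uninformative padding tokens.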
