Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition

VizWiz Grand Challenge: Answering Visual Questions from Blind People

Abstract

The study of algorithms to automatically answer visual questions is currently motivated by visual question answering (VQA) datasets constructed in artificial VQA settings. We propose VizWiz, the first goal-oriented VQA dataset arising from a natural VQA setting. VizWiz consists of over 31,000 visual questions originating from blind people who each took a picture using a mobile phone and recorded a spoken question about it, together with 10 crowdsourced answers per visual question. VizWiz differs from the many existing VQA datasets because (1) images are captured by blind photographers and so are often of poor quality, (2) questions are spoken and so are more conversational, and (3) visual questions often cannot be answered. Evaluation of modern algorithms for answering visual questions and deciding whether a visual question is answerable reveals that VizWiz is a challenging dataset. We introduce this dataset to encourage a larger community to develop more generalized algorithms that can assist blind people.
