Venue: International Joint Conference on Natural Language Processing; Annual Meeting of the Association for Computational Linguistics

PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling


Abstract

We present a new human-human dialogue dataset, PhotoChat, the first dataset that casts light on photo sharing behavior in online messaging. PhotoChat contains 12k dialogues, each of which is paired with a user photo that is shared during the conversation. Based on this dataset, we propose two tasks to facilitate research on image-text modeling: a photo-sharing intent prediction task that predicts whether one intends to share a photo in the next conversation turn, and a photo retrieval task that retrieves the most relevant photo according to the dialogue context. For both tasks, we provide baselines built on state-of-the-art models and report their benchmark performance. The best image retrieval model achieves 10.4% recall@1 (out of 1000 candidates) and the best photo-sharing intent prediction model achieves a 58.1% F1 score, indicating that the dataset presents interesting yet challenging real-world problems. We are releasing PhotoChat to facilitate future research in the community.
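The abstract reports recall@1 and F1 without spelling out how they are computed. The sketch below shows the standard definitions of both metrics on toy data. It is only an illustration: the function names `recall_at_k` and `f1_score` and the input layout are assumptions, not the authors' evaluation code, and the toy values are not taken from PhotoChat.

```python
from typing import Sequence


def recall_at_k(ranked_ids: Sequence[Sequence[str]],
                gold_ids: Sequence[str],
                k: int = 1) -> float:
    """Fraction of dialogues whose shared (gold) photo is in the top-k ranking.

    ranked_ids[i] is the candidate list for dialogue i, best match first
    (hypothetical layout, e.g. 1000 candidates per dialogue).
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("one ranking is required per gold photo")
    if not gold_ids:
        return 0.0
    hits = sum(1 for ranking, gold in zip(ranked_ids, gold_ids)
               if gold in ranking[:k])
    return hits / len(gold_ids)


def f1_score(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Binary F1 for the positive class (here: 'a photo is shared next turn')."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: two dialogues, two candidate photos each.
rankings = [["photo_a", "photo_b"], ["photo_c", "photo_d"]]
gold = ["photo_a", "photo_d"]
r1 = recall_at_k(rankings, gold, k=1)  # only the first dialogue is a hit
r2 = recall_at_k(rankings, gold, k=2)  # both gold photos are in the top 2

# Toy intent predictions: one true positive, one false positive, one false negative.
f1 = f1_score([True, True, False, False], [True, False, True, False])
```

Under these definitions, a recall@1 of 10.4% with 1000 candidates means the shared photo is ranked first for roughly one dialogue in ten, against a chance level of 0.1%.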
