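Machine translation: Hierarchical and multimodal video captioning: discovering multimodal knowledge from vision and transferring it to language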
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China;
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China;
Smart Systems Institute, National University of Singapore, Singapore;
NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore;
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China;
School of Computing, National University of Singapore, Singapore;
Video to text; Semantic discovery; Multi-modal fusion; Deep learning;
Machine translation: Hierarchical attention-based multimodal fusion for video captioning
Machine translation: MSVD-Turkish: a comprehensive multimodal video dataset for integrated vision and language research in Turkish
Machine translation: Video captioning with boundary-aware hierarchical language decoding and joint video prediction
Machine translation: Hierarchical vision-language alignment for video captioning
Machine translation: Multimodal learning from video and natural language with minimal human supervision
Machine translation: Comparison of comprehension processes in sign language interpreted videos with and without captions
Machine translation: Multimodal postoperative analgesia in outpatient videolaparoscopic gynecological surgery: a comparison of parecoxib and tenoxicam
Machine translation: Online joint learning of object concepts and language models using a multimodal hierarchical Dirichlet process