Discovering hierarchical object models from captioned images
Department of Computer Science, University of Toronto, 6 King's College Rd., Toronto, Ontario, Canada M5S 3C4;
language-vision integration; object recognition; automatic image annotation; learning hierarchical models
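Neural image captioning model with a caption-to-image semantic constructor
Hierarchical and multimodal video captioning: discovering multimodal visual knowledge and transferring it to language
Hierarchical image modeling for object-based media retrieval
Discovering multipart appearance models from captioned images
Discovering objects in images and videos
Image captioning using Motion-CNN with object detection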