
SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT


Abstract

A method of embedding video for text search includes extracting visual features from a video. The visual features may, for example, include appearance information, motion, audio, and/or like features. Term vectors are determined from textual descriptions associated with the video. The text may be included in a title for the video or included within the video (e.g., subtitles), for example. A feature projection is computed based on the extracted video features and a textual projection is computed based on the term vectors. A semantic embedding is computed based on the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.
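The abstract does not state the objective or the solver, so the following is only a minimal sketch of one way the joint optimization could look, under assumptions that are not in the source. It assumes linear projections, a squared-error loss, and alternating least squares. The names `fit_joint_embedding`, `rank_videos`, `lam`, and `ridge` are hypothetical. Here `X` holds one row of extracted features per video, `Y` holds the matching term vectors, `W` plays the role of the feature projection, `A` the textual projection, and `S` the shared semantic embedding. The term `||Y - S A||^2` stands in for descriptiveness (the embedding should reconstruct the terms) and `||S - X W||^2` for predictability (the embedding should be predictable from the video features).

```python
import numpy as np


def fit_joint_embedding(X, Y, k=4, lam=1.0, ridge=1e-3, iters=30, seed=0):
    """Alternating least squares on an assumed objective (not from the source):

        ||Y - S A||^2 + ridge ||A||^2 + lam * (||S - X W||^2 + ridge ||W||^2)

    X: (n, d) video features, Y: (n, m) term vectors.
    Returns W (d, k), A (k, m), and the objective value after each iteration.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    S = rng.standard_normal((n, k))  # random initial embedding
    eye_k = np.eye(k)
    eye_d = np.eye(d)
    history = []
    for _ in range(iters):
        # Feature projection: ridge regression from features to the embedding.
        W = np.linalg.solve(X.T @ X + ridge * eye_d, X.T @ S)
        # Textual projection: ridge regression from the embedding to the terms.
        A = np.linalg.solve(S.T @ S + ridge * eye_k, S.T @ Y)
        # Embedding: closed-form minimizer with W and A held fixed,
        # S = (Y A^T + lam X W) (A A^T + lam I)^{-1}.
        M = A @ A.T + lam * eye_k
        S = np.linalg.solve(M, (Y @ A.T + lam * (X @ W)).T).T
        loss = (
            np.sum((Y - S @ A) ** 2)
            + ridge * np.sum(A ** 2)
            + lam * (np.sum((S - X @ W) ** 2) + ridge * np.sum(W ** 2))
        )
        history.append(float(loss))
    return W, A, history


def rank_videos(X, W, A, query_terms):
    """Rank videos for a text query given as a term vector of length m.

    Each video is projected into the embedding (X W), mapped to predicted
    term weights (X W A), and scored by its dot product with the query.
    Returns video indices, best match first.
    """
    scores = (X @ W @ A) @ query_terms
    return np.argsort(-scores)
```

At query time only `W` and `A` are needed, so a new video can be scored from its extracted features alone, without any accompanying text. Whether the method described in the abstract uses this loss or this scoring rule is not stated there.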
