SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT

Abstract

A method of embedding video for text search includes extracting visual features from a video. The visual features may, for example, include appearance information, motion, audio, and/or like features. Term vectors are determined from textual descriptions associated with the video. The text may be included in a title for the video or included within the video (e.g., subtitles), for example. A feature projection is computed based on the extracted video features and a textual projection is computed based on the term vectors. A semantic embedding is computed based on the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.
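The pipeline in the abstract can be illustrated with a minimal sketch. This is not the patented method: it assumes linear projections, squared-error losses, a tied-weight reconstruction of the term vectors, and backtracking gradient descent, none of which the abstract specifies. The names `joint_loss`, `joint_grads`, `fit_embedding`, `search`, and the trade-off weight `lam` are hypothetical. Here `X` holds one row of extracted video features per video, `Y` holds the matching term vectors, `W` is the feature projection, and `A` is the textual projection.

```python
import numpy as np


def joint_loss(X, Y, W, A, lam):
    """Descriptiveness error plus lam times predictability error (assumed forms)."""
    S = Y @ A                             # textual projection: one embedding per description
    descr = np.sum((S @ A.T - Y) ** 2)    # can the embedding reconstruct the term vectors?
    pred = np.sum((X @ W - S) ** 2)       # can the video features predict the embedding?
    return descr + lam * pred


def joint_grads(X, Y, W, A, lam):
    """Gradients of joint_loss with respect to W and A."""
    S = Y @ A
    R = S @ A.T - Y                       # reconstruction residual, shape (n, t)
    E = X @ W - S                         # prediction residual, shape (n, k)
    grad_A = 2.0 * (Y.T @ R + R.T @ Y) @ A - 2.0 * lam * (Y.T @ E)
    grad_W = 2.0 * lam * (X.T @ E)
    return grad_W, grad_A


def fit_embedding(X, Y, k, lam=1.0, steps=200, seed=0):
    """Fit both projections jointly; returns W, A and the loss history."""
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.standard_normal((X.shape[1], k))
    A = 0.01 * rng.standard_normal((Y.shape[1], k))
    history = [joint_loss(X, Y, W, A, lam)]
    step = 1.0
    for _ in range(steps):
        gW, gA = joint_grads(X, Y, W, A, lam)
        # Backtracking: halve the step until the loss actually decreases.
        while step > 1e-12:
            W_new, A_new = W - step * gW, A - step * gA
            loss_new = joint_loss(X, Y, W_new, A_new, lam)
            if loss_new < history[-1]:
                W, A = W_new, A_new
                history.append(loss_new)
                step *= 2.0               # try a larger step next time
                break
            step *= 0.5
        else:
            break                         # no improving step found; stop early
    return W, A, history


def search(query_terms, X, W, A):
    """Rank videos by cosine similarity between query and video embeddings."""
    q = query_terms @ A                   # embed the text query
    V = X @ W                             # embed every video from its features
    scores = (V @ q) / (np.linalg.norm(V, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-scores)            # best match first
```

At query time only the term vector of the query and the stored video features are needed: `search(query_terms, X, W, A)` returns video indices ordered from most to least similar. Raising `lam` in this sketch favors embeddings that are easy to predict from video features; lowering it favors embeddings that preserve more of the text.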

Bibliographic details

Publication number: WO2017052791A1
Publication date: 2017-03-30
Original format: PDF
Applicant/assignee: QUALCOMM INCORPORATED
Application number: WO2016US45353
Inventors: HABIBIAN AMIRHOSSEIN; MENSINK THOMAS EDGAR JOSEF; SNOEK CORNELIS GERARDUS MARIA
Filing date: 2016-08-03
Classification: G06F17/30; G06N5
Country: WO
Database entry time: 2022-08-21 13:31:34
