SYSTEM AND METHOD FOR JOINT SPEAKER AND SCENE RECOGNITION IN A VIDEO/AUDIO PROCESSING ENVIRONMENT

Abstract

An example method is provided that includes receiving a media file that includes video data and audio data; determining an initial scene sequence in the media file; determining an initial speaker sequence in the media file; and updating a selected one of the initial scene sequence and the initial speaker sequence in order to generate an updated scene sequence and an updated speaker sequence, respectively. The initial scene sequence is updated based on the initial speaker sequence, and the initial speaker sequence is updated based on the initial scene sequence.
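
The mutual update described in the abstract can be illustrated with a toy sketch. The abstract does not state the actual update rule, so everything here is an assumption made for illustration: the media file is taken to be already split into aligned segments with one scene label and one speaker label each, and a simple co-occurrence majority vote stands in for whatever rule the application uses. The names `refine` and `joint_update` are hypothetical.

```python
from collections import Counter, defaultdict


def refine(target, context):
    """Relabel each segment of `target` with the target label that most
    often co-occurs with that segment's `context` label.

    Assumption: a majority vote is only a stand-in for the (unstated)
    update rule of the patent application.
    """
    votes = defaultdict(Counter)
    for t, c in zip(target, context):
        votes[c][t] += 1
    return [votes[c].most_common(1)[0][0] for c in context]


def joint_update(initial_scenes, initial_speakers):
    """Update each initial sequence using the other initial sequence."""
    updated_scenes = refine(initial_scenes, initial_speakers)
    updated_speakers = refine(initial_speakers, initial_scenes)
    return updated_scenes, updated_speakers


# Hypothetical per-segment labels; segment 2 looks like a scene error
# and a speaker error relative to its neighbours.
scenes = ["office", "office", "street", "office", "street", "street"]
speakers = ["alice", "alice", "alice", "alice", "bob", "bob"]

new_scenes, new_speakers = joint_update(scenes, speakers)
print(new_scenes)
print(new_speakers)
```

Both updates read only the initial sequences, mirroring the abstract's wording that the scene sequence is updated from the initial speaker sequence and the speaker sequence from the initial scene sequence.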

Bibliographic data

Publication number: WO2013170212A1
Publication date: 2013-11-14
Original format: PDF
Applicant/assignee: CISCO TECHNOLOGY INC.
Application number: WO2013US40650
Inventors: CHOU JIM CHEN; KAJAREKAR SACHIN; CATCHPOLE JASON J.; SANKAR ANANTH
Filing date: 2013-05-10
Classification: H04N7/14; G06K9
Country: WO
Database entry time: 2022-08-21 15:53:21
