SYSTEM AND METHOD FOR JOINT SPEAKER AND SCENE RECOGNITION IN A VIDEO/AUDIO PROCESSING ENVIRONMENT

Abstract

An example method is provided that includes receiving a media file that includes video data and audio data; determining an initial scene sequence in the media file; determining an initial speaker sequence in the media file; and updating a selected one of the initial scene sequence and the initial speaker sequence in order to generate an updated scene sequence and an updated speaker sequence, respectively. The initial scene sequence is updated based on the initial speaker sequence, and the initial speaker sequence is updated based on the initial scene sequence.
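
The abstract does not say how either sequence is updated from the other, so the following is only a minimal Python sketch of the general idea under stated assumptions. It assumes each sequence is a list of per-segment labels with a confidence score, and it uses a hypothetical co-occurrence vote as the update rule. The names `update_sequence`, `joint_update`, and `threshold`, along with the sample labels and scores, are illustrative and do not come from the patent.

```python
from collections import Counter, defaultdict


def update_sequence(target, target_conf, context, threshold=0.5):
    """Relabel low-confidence entries of `target` using `context`.

    Hypothetical rule (an assumption, not the patented method): a label
    whose confidence is below `threshold` is replaced by the label that
    most often accompanies the same context label among confident entries.
    """
    votes = defaultdict(Counter)
    for label, conf, ctx in zip(target, target_conf, context):
        if conf >= threshold:
            votes[ctx][label] += 1

    updated = []
    for label, conf, ctx in zip(target, target_conf, context):
        if conf < threshold and votes[ctx]:
            label = votes[ctx].most_common(1)[0][0]
        updated.append(label)
    return updated


def joint_update(scenes, scene_conf, speakers, speaker_conf):
    """Update each initial sequence based on the other initial sequence."""
    updated_scenes = update_sequence(scenes, scene_conf, speakers)
    updated_speakers = update_sequence(speakers, speaker_conf, scenes)
    return updated_scenes, updated_speakers


# Made-up per-segment labels and confidences, for illustration only.
scenes = ["studio", "studio", "studio", "field",
          "studio", "field", "field", "field"]
scene_conf = [0.9, 0.9, 0.8, 0.9, 0.3, 0.9, 0.8, 0.9]
speakers = ["anchor", "anchor", "anchor", "reporter",
            "reporter", "reporter", "anchor", "reporter"]
speaker_conf = [0.9, 0.9, 0.9, 0.8, 0.9, 0.9, 0.2, 0.9]

updated_scenes, updated_speakers = joint_update(
    scenes, scene_conf, speakers, speaker_conf
)
print(updated_scenes)
print(updated_speakers)
```

In this made-up data, the fifth scene label and the seventh speaker label have low confidence, so each is replaced by the label that most often co-occurs with the other sequence's label at that position. Both updates read only the initial sequences, mirroring the abstract's wording.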
