A Multi-modal Video Analysis System

Abstract

In this paper, we present a system for Chinese news program management based on cross-media video analysis. Audio, caption text, and video frames are all important for a person to understand the meaning of a video. Given these facts, we devised a system integrating continuous Chinese speech recognition (ASR), video caption text recognition (VOCR), and object/scene recognition (OR). The news program is first segmented into a series of segments by anchor person detection. The ASR and VOCR recognition results are then treated as two paragraphs of text, and we convert them into two bags of words that represent the original recognition results. By analysing the correspondence between the words in the ASR result and those in the VOCR result, we obtain a trusted set of words that depicts the video content of a news program segment. In the last step, we implement object/scene classification based on keyframe analysis, aided by the recognized words above. Experiments show that our news management system is efficient.
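The bag-of-words correspondence step can be illustrated with a minimal sketch. The abstract does not say how the correspondence between ASR and VOCR words is computed, so the multiset intersection used here is an assumption, as are the function names `bag_of_words` and `trusted_words` and the sample tokens. A real system would operate on segmented Chinese words produced by the two recognizers; English placeholders are used here for readability.

```python
from collections import Counter


def bag_of_words(tokens):
    """Count occurrences of each non-empty token."""
    return Counter(t for t in tokens if t)


def trusted_words(asr_tokens, vocr_tokens):
    """Return the words found in both bags, each with the smaller of its two counts.

    Assumption: a word is 'trusted' if both recognizers produced it.
    The paper's actual correspondence analysis may differ.
    """
    asr_bag = bag_of_words(asr_tokens)
    vocr_bag = bag_of_words(vocr_tokens)
    # Counter's & operator is a multiset intersection (minimum count per key).
    return dict(asr_bag & vocr_bag)


# Hypothetical recognition output for one news segment.
asr_tokens = ["premier", "visits", "shanghai", "economy", "summit", "shanghai"]
vocr_tokens = ["premier", "shanghai", "summit", "opens", "shanghai"]

segment_keywords = trusted_words(asr_tokens, vocr_tokens)
```

Words produced by only one recognizer ("visits", "economy", "opens") are dropped, which reflects the idea that agreement between two independent modalities is stronger evidence than either result alone.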