IEEE Signal Processing Magazine

Automatic multimedia indexing: combining audio, speech, and visual information to index broadcast news


Abstract

This paper describes an indexing system that automatically creates metadata for multimedia broadcast news content by integrating audio, speech, and visual information. The system comprises acoustic segmentation (AS), automatic speech recognition (ASR), topic segmentation (TS), and video indexing features. New spectral-based features and a smoothing method in the AS module improved speech detection in the audio stream of the input news content. In the ASR module, automatic selection of acoustic models achieved both a low word error rate (WER), as with parallel recognition using multiple acoustic models, and fast recognition, as with a single acoustic model. The TS method using word concept vectors produced more accurate results than the conventional method using local word frequency vectors. The information integration module combines the results of the AS, TS, and SC modules. Combining the TS results with the AS and SC results improved story boundary detection accuracy over the TS results alone.
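
The abstract does not give the details of either the topic segmentation or the integration step, so the following is only a minimal sketch under stated assumptions. It illustrates the conventional baseline the abstract mentions (local word frequency vectors compared across each sentence gap) followed by a hypothetical weighted-sum fusion with per-gap AS and SC cues. The function names, the window size, the weights, and the threshold are all illustrative choices, not values from the paper, and the paper's own TS method uses word concept vectors rather than raw counts.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ts_scores(sentences, window=2):
    """Return a boundary score for each gap i (between sentence i and i+1).

    The score is 1 - cosine similarity between the word-frequency vectors
    of up to `window` sentences before and after the gap. This is the
    assumed "local word frequency" baseline, not the paper's method.
    """
    scores = []
    for gap in range(len(sentences) - 1):
        before = sentences[max(0, gap + 1 - window): gap + 1]
        after = sentences[gap + 1: gap + 1 + window]
        left = Counter(w for s in before for w in s)
        right = Counter(w for s in after for w in s)
        scores.append(1.0 - cosine(left, right))
    return scores


def fuse(ts, as_cue, sc_cue, weights=(0.6, 0.2, 0.2), threshold=0.7):
    """Hypothetical fusion: weighted sum of per-gap TS, AS, and SC cues.

    Gaps whose fused score reaches `threshold` are reported as story
    boundaries. Weights and threshold are assumptions for illustration.
    """
    w_ts, w_as, w_sc = weights
    fused = [w_ts * t + w_as * a + w_sc * c
             for t, a, c in zip(ts, as_cue, sc_cue)]
    return [i for i, f in enumerate(fused) if f >= threshold]


# Toy transcript: two sentences about an election, then two about a storm.
sentences = [
    ["election", "vote", "senate"],
    ["vote", "senate", "election"],
    ["storm", "rain", "flood"],
    ["rain", "flood", "storm"],
]
scores = ts_scores(sentences)
# Assumed binary cues: 1 where the AS or SC module flags the gap.
boundaries = fuse(scores, as_cue=[0, 1, 0], sc_cue=[0, 1, 1])
print(boundaries)
```

In this toy input the SC cue also fires at the last gap, but the low TS score there keeps the fused score under the threshold. That is the kind of cross-checking between modules the abstract credits for the improved story boundary detection.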
