Automatic multimedia indexing: combining audio, speech, and visual information to index broadcast news

Ohtsuki K.; Bessho K.; Matsuo Y.; Matsunaga S.; Hayashi Y.

Abstract

This paper describes an indexing system that automatically creates metadata for multimedia broadcast news content by integrating audio, speech, and visual information. The system comprises acoustic segmentation (AS), automatic speech recognition (ASR), topic segmentation (TS), and video indexing features. In the AS module, new spectral-based features and a smoothing method improved speech detection in the audio stream of the input news content. In the speech recognition module, automatic selection of acoustic models achieved both a low word error rate (WER), as with parallel recognition using multiple acoustic models, and fast recognition, as with a single acoustic model. The TS method using word concept vectors achieved more accurate results than the conventional method using local word frequency vectors. The information integration module integrates the results from the AS, TS, and SC modules. Combining the TS results with the AS and SC results improved story boundary detection accuracy over the TS results alone.
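The abstract contrasts word concept vectors with a conventional method based on local word frequency vectors. The following is a minimal sketch of that baseline idea only, assuming a boundary is placed wherever the cosine similarity between the word counts of adjacent sentence windows falls below a threshold. It is not the authors' implementation: the function names, the window size, and the threshold value are hypothetical choices made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def boundary_scores(sentences, window=2):
    """Similarity across each gap; gap i lies between sentences i-1 and i.

    Each side of the gap is summarized by the word counts of up to
    `window` sentences (the "local word frequency vector").
    """
    scores = {}
    for gap in range(1, len(sentences)):
        left = Counter(w for s in sentences[max(0, gap - window):gap] for w in s)
        right = Counter(w for s in sentences[gap:gap + window] for w in s)
        scores[gap] = cosine(left, right)
    return scores


def detect_boundaries(sentences, window=2, threshold=0.1):
    """Return gaps whose similarity is below `threshold` (assumed value)."""
    scores = boundary_scores(sentences, window)
    return [gap for gap, score in sorted(scores.items()) if score < threshold]


# Toy transcript: two sentences about an election, then two about weather.
sentences = [
    ["election", "vote"],
    ["vote", "candidate"],
    ["storm", "rain"],
    ["rain", "flood"],
]
print(detect_boundaries(sentences))  # [2]
```

In this toy input the only gap with no shared vocabulary is the one between the second and third sentences, so it is the only boundary reported. A concept-vector variant would replace the raw counts with vectors that also relate different but semantically close words, which this sketch does not attempt.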

Bibliographic details

Source
IEEE Signal Processing Magazine | 2006, No. 2 | pp. 69-78 | 10 pages
Authors
Ohtsuki K.; Bessho K.; Matsuo Y.; Matsunaga S.; Hayashi Y.
Original format
PDF
Language
English
Chinese Library Classification
Communication theory
Keywords
acoustic signal processing; audio signal processing; database indexing; multimedia databases; speech recognition; acoustic segmentation; audio information; automatic multimedia content indexing system; automatic speech recognition; boundary detection accuracy; inf;
