VIDEO INDEXING BASED ON IMAGE AND SOUND

Abstract

Video indexing is a major challenge for both scientific and economic reasons. Information extraction can sometimes be easier from the sound channel than from the image channel. We first present a multi-channel and multi-modal query interface, which queries sound, image and script through "pull" and "push" queries. We then summarize the segmentation phase, which needs information from the image channel. Detection of critical segments is proposed; it should speed up both automatic and manual indexing. We then present an overview of the information extraction phase. Information can be extracted from the sound channel through speaker recognition, vocal dictation with unconstrained vocabularies, and script alignment with speech (or "script warping"). We present experimental results for these techniques. Speaker recognition methods were tested on the TIMIT and NTIMIT databases. Vocal dictation was tested on newspaper sentences spoken by several speakers. Script alignment was tested on part of a cartoon movie, "Ivanhoe". For good-quality sound segments, error rates are low enough for use in indexing applications. Major open issues are the processing of sound segments with noise or music, and performance improvement through the use of appropriate, low-cost parallel architectures or networks of workstations.
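
The abstract does not say which algorithm is used for script alignment. Purely as an illustration of what warping one sequence onto another can look like, the sketch below uses classic dynamic time warping (DTW) on two lists of numbers. The function name `dtw_align` and the use of scalar features with an absolute-difference cost are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def dtw_align(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two non-empty sequences with dynamic time warping.

    Returns the total alignment cost and the warping path as (index_in_a, index_in_b) pairs.
    Hypothetical helper for illustration; not the method described in the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance a only
            (cost[i][j - 1], i, j - 1),          # advance b only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    total, path = dtw_align([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
    print(total, path)
```

In a real script-to-speech setting, the two sequences would be feature vectors derived from the audio and from the script rather than plain numbers, and the local distance would be chosen accordingly.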


Bibliographic details

Source
Society of Photo-Optical Instrumentation Engineers Conference on Multimedia Storage and Archiving Systems | 1997 | 13 pages
Authors
Pascal Faudemay; Claude Montacie; Marie-Jo Caraty;
Original format: PDF
Chinese Library Classification: TP334-532
Keywords
Video indexing; query interface; pull queries; push queries; critical segments; speaker recognition; vocal dictation; script warping;


Similar documents

1. CONTENT-BASED INDEXING OF IMAGES AND VIDEO [J]. Pentland A. Philosophical Transactions of the Royal Society of London, Series B. Biological Sciences. 1997, no. 1358
2. A Caption Text Detection Method from Images/Videos for Efficient Indexing and Retrieval of Multimedia Data [J]. Samabia Tehsin, Asif Masood, Sumaira Kausar. International Journal of Pattern Recognition and Artificial Intelligence. 2015, no. 1
3. Survey of Region-Based Text Extraction Techniques for Efficient Indexing of Image/Video Retrieval [J]. Samabia Tehsin, Asif Masood, Sumaira Kausar. International Journal of Image, Graphics and Signal Processing. 2014, no. 12
4. VIDEO INDEXING BASED ON IMAGE AND SOUND [C]. Pascal Faudemay, Claude Montacie, Marie-Jo Caraty. Society of Photo-Optical Instrumentation Engineers Conference on Multimedia Storage and Archiving Systems. 1997
5. Wavelet-based coding and indexing of images and video [D]. Mandal, Mrinal Kumar. 1998
6. Content-based indexing of images and video [O]. A Pentland. 1997
7. Content-Based Indexing of Images and Video Using Face Detection and Recognition Methods [O]. Stefan Eickeler, Frank Wallhoff, Uri Iurgel. 2001
8. System for Indexing Multi-Spectral Satellite Images for Efficient Content-Based Retrieval [R]. Barros, J., French, J., Martin, W. 2003
