Multi-modal information retrieval from broadcast video using OCR and speech recognition

Abstract

We examine multi-modal information retrieval from broadcast video, where on-screen text can be read through OCR and speech recognition can be performed on the audio track. OCR and speech recognition are compared on the 2001 TREC Video Retrieval evaluation corpus. Results show that OCR is more important than speech recognition for video retrieval. OCR retrieval can be further improved through dictionary-based post-processing. We demonstrate how to use imperfect multi-modal metadata to benefit multi-modal information retrieval.
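
The abstract does not describe how the dictionary-based post-processing works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that noisy OCR tokens are snapped to the closest word in a known vocabulary using string similarity from Python's standard `difflib` module. The function name `correct_ocr_tokens`, the `cutoff` value, and the sample words are all hypothetical.

```python
import difflib


def correct_ocr_tokens(tokens, dictionary, cutoff=0.75):
    """Map each OCR token to its closest dictionary word, if one is similar enough.

    Illustrative assumption only: similarity is difflib's sequence ratio,
    and tokens with no sufficiently close match are kept unchanged.
    """
    vocab = sorted({w.lower() for w in dictionary})
    vocab_set = set(vocab)
    corrected = []
    for tok in tokens:
        word = tok.lower()
        if word in vocab_set:
            # Already a dictionary word: nothing to fix.
            corrected.append(word)
            continue
        matches = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return corrected


# Hypothetical vocabulary and noisy OCR output (digit "1" misread for "i").
dictionary = ["president", "clinton", "washington", "report"]
ocr_output = ["Pres1dent", "Cl1nton", "in", "Wash1ngton"]
result = correct_ocr_tokens(ocr_output, dictionary)
print(result)
```

In a retrieval setting, the corrected tokens would be indexed in place of the raw OCR output so that queries such as "clinton" can match frames whose on-screen text was misrecognized.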

Bibliographic details

Source: ACM/IEEE-CS joint conference on Digital libraries, 2002, pp. 160-161 (2 pages)
Authors: Alexander G. Hauptmann; Rong Jin; Tobun Dorbin Ng
Original format: PDF
Chinese Library Classification: electronic libraries, digital libraries
Keywords: speech recognition

Similar literature

1. Semantic retrieval of video - review of research on video retrieval in meetings, movies and broadcast news, and sports [J]. Ziyou Xiong, Xiang Sean Zhou, Qi Tian. IEEE Signal Processing Magazine, 2006, No. 2
2. Speech Shot Extraction from Broadcast News Videos [J]. Shogo Kumagai, Keisuke Doman, Tomokazu Takahashi. International Journal of Semantic Computing, 2012, No. 2
3. Online Speech Detection and Dual-Gender Speech Recognition for Captioning Broadcast News [J]. NHK技研 R&D, 2009, No. 114
4. Multi-modal information retrieval from broadcast video using OCR and speech recognition [C]. Alexander G. Hauptmann, Rong Jin, Tobun Dorbin Ng. ACM/IEEE-CS joint conference on Digital libraries, 2002
5. A multimodal fusion approach for automatic postal address recognition system using Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) techniques [D]. Singh, Amriteshwar. 2011
6. Multi-Modal Fusion Emotion Recognition Method of Speech Expression Based on Deep Learning [O]. Dong Liu, Zhiyong Wang, Lifeng Wang. 2021
7. Multi-modal Information Retrieval from Broadcast Video using OCR and Speech Recognition [O]. Hauptmann Alexander, Jin Rong, Ng Tobun Dorbin. 2002
