Fusing audio and video information toward detection of speech events under real environments

Takashi Yoshimura; Futoshi Asano; Youichi Motomura; Hideki Asoh; Naoyuki Ichimura; Kiyoshi Yamamoto; Satoshi Nakamura

Abstract

This paper proposes a method for detecting and separating speech events in the presence of multiple sound sources, using both audio and video information. To detect speech events, sound localization with a microphone array and human tracking with stereo vision are combined in a Bayesian network. The inference results of the Bayesian network give the time and location of each speech event, even when multiple sound sources are present. Based on the detected speech event information, a maximum likelihood adaptive beamformer is constructed, and the speech signal is separated from background noise and interference.
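The fusion step can be illustrated with a minimal sketch. The abstract does not describe the structure or parameters of the Bayesian network, so everything below is an assumption made for illustration: a single hidden variable S (a speech event at a given direction and time frame) with two conditionally independent binary observations, A (the microphone array localizes a sound there) and V (the stereo tracker places a person there). The probability values and the names `P_SPEECH`, `P_AUDIO_GIVEN_S`, `P_VIDEO_GIVEN_S`, and `speech_posterior` are hypothetical.

```python
# Hypothetical probabilities; the paper's actual network and parameters
# are not given in the abstract.
P_SPEECH = 0.2                       # prior P(S=1)
P_AUDIO_GIVEN_S = {1: 0.9, 0: 0.3}   # P(A=1 | S): sound localized at this direction
P_VIDEO_GIVEN_S = {1: 0.95, 0: 0.4}  # P(V=1 | S): person tracked at this direction


def likelihood(table, s, observed):
    """Return P(obs | S=s) for a binary observation."""
    p_true = table[s]
    return p_true if observed else 1.0 - p_true


def speech_posterior(audio_detected, person_detected):
    """Return P(S=1 | A, V) by enumerating both states of S (Bayes' rule)."""
    joint = {}
    for s in (0, 1):
        prior = P_SPEECH if s == 1 else 1.0 - P_SPEECH
        joint[s] = (
            prior
            * likelihood(P_AUDIO_GIVEN_S, s, audio_detected)
            * likelihood(P_VIDEO_GIVEN_S, s, person_detected)
        )
    return joint[1] / (joint[0] + joint[1])


if __name__ == "__main__":
    # Print the posterior for every combination of audio and video evidence.
    for a in (True, False):
        for v in (True, False):
            print(f"audio={a!s:5} video={v!s:5} P(speech)={speech_posterior(a, v):.3f}")
```

With these assumed values, the posterior is high only when both modalities agree: a sound with no person at that direction (for example, a loudspeaker) or a silent person both leave the speech probability low. That matches the motivation for fusion stated in the abstract.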


Bibliographic details

Source
電子情報通信学会技術研究報告. 音声. Speech | 2003, No. 26 | 6 pages
Authors
Takashi Yoshimura; Futoshi Asano; Youichi Motomura; Hideki Asoh; Naoyuki Ichimura; Kiyoshi Yamamoto; Satoshi Nakamura
Original format: PDF
Text language: Japanese (jpn)
Chinese Library Classification: Telegraphy, facsimile
Keywords
Sound localization; Human tracking; Information fusion; Bayesian network


