Conference: International Conference on Speech and Computer

Using a High-Speed Video Camera for Robust Audio-Visual Speech Recognition in Acoustically Noisy Conditions

Abstract

The purpose of this study is to develop a robust audio-visual speech recognition system and to investigate how high-speed video data influences the recognition accuracy of continuous Russian speech under different noise conditions. The experimental setup we developed and the multimodal database we collected allow us to explore the impact of video recorded at various frame rates, from the standard 25 frames per second (fps) up to a high-speed 200 fps. At present, no research objectively characterizes the dependence of speech recognition accuracy on video frame rate, and no relevant audio-visual databases exist for model training. In this paper, we try to fill this gap for continuous Russian speech. Our evaluation experiments show an absolute increase in recognition accuracy of up to 3% and demonstrate that using the JAI Pulnix high-speed camera at 200 fps yields better recognition results under different acoustically noisy conditions.
