
Collaborative steering of microphone array and video camera toward multi-lingual tele-conference through speech-to-speech translation


Abstract

Multilingual teleconferencing through speech-to-speech translation depends on capturing distant-talking speech with high quality. The speaker's image is also needed for natural communication in such a conference. A microphone array is an ideal candidate for capturing distant-talking speech: steering a microphone array and a video camera toward the speaker enhances the uttered speech and captures the speaker's image. Automatic steering, however, requires localizing the talker. To address this, we propose real-time collaborative steering of the microphone array and the video camera for a multilingual teleconference through speech-to-speech translation. We conducted experiments in a real room environment. With the speaker located 2.0 meters from the microphone array, the speaker localization rate (i.e., the speaker image capturing rate) was 97.7%, the speech recognition rate was 90.0%, and the TOEIC score was 530 to 540 points.
