Journal: Signal Processing Magazine, IEEE

Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-Field Sensors

Abstract

Distant speech recognition (DSR) holds the promise of the most natural human-computer interface because it enables man-machine interactions through speech, without the necessity of donning intrusive body- or head-mounted microphones. Recognizing distant speech robustly, however, remains a challenge. This contribution provides a tutorial overview of DSR systems based on microphone arrays. In particular, we present recent work on acoustic beamforming for DSR, along with experimental results verifying the effectiveness of the various algorithms described here. Beginning from a word error rate (WER) of 14.3% with a single microphone of a linear array, our state-of-the-art DSR system achieved a WER of 5.3%, which was comparable to the 4.2% obtained with a lapel microphone. Moreover, we present an emerging technology in the area of far-field audio and speech processing based on spherical microphone arrays. Performance comparisons of spherical and linear arrays reveal that a spherical array with a diameter of 8.4 cm can provide recognition accuracy comparable to or better than that obtained with a large linear array with an aperture length of 126 cm.
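To make the WER figures in the abstract concrete, the following is a minimal Python sketch of the standard word error rate definition (substitutions, deletions, and insertions divided by the number of reference words), together with the relative reduction implied by the quoted 14.3% and 5.3%. The abstract does not say which scoring tool the authors used, so the function names `word_error_rate` and `relative_reduction` and the example sentences are hypothetical illustrations, not the authors' evaluation pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Assumption: whitespace tokenization; real scoring tools also normalize text.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical sentences: one deleted word out of four reference words.
    print(word_error_rate("turn on the light", "turn on light"))
    # WER figures quoted in the abstract: single microphone vs. full DSR system.
    print(round(relative_reduction(14.3, 5.3), 3))
```

With the abstract's numbers, the step from 14.3% to 5.3% WER is a relative reduction of roughly 63%, which leaves the array system 1.1 absolute points behind the 4.2% lapel-microphone result.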
