Journal: IEEE Transactions on Audio, Speech, and Language Processing
Closely Coupled Array Processing and Model-Based Compensation for Microphone Array Speech Recognition

Abstract

In conventional microphone array speech recognition, the array processor and the speech recognizer are loosely coupled: the only connection between the two modules is the enhanced target signal output by the array processor, which the recognizer treats as a single input. This approach ignores useful environmental information that the array processor can provide and that the recognizer should exploit. As a multi-input multi-output (MIMO) module, the array processor can inherently generate multiple spatially filtered output signals. This paper proposes a closely coupled approach in which a recognizer with model-based noise compensation exploits the reference noise outputs of a MIMO array processor. Specifically, a multichannel model-based noise compensation method is presented, comprising a compensation procedure based on the vector Taylor series (VTS) expansion and parameter estimation with the expectation-maximization (EM) algorithm. The paper also shows how to construct MIMO array processors from conventional beamformers. Several practical implementations of the conventional loosely coupled approach and of the proposed closely coupled approach were tested on a publicly available database, the Multichannel Overlapping Number Corpus (MONC). Experimental results showed that the proposed closely coupled approach significantly improved speech recognition performance in overlapping speech situations.
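
To make the idea of a MIMO array processor concrete, the sketch below builds one from a delay-and-sum beamformer: the first output is the enhanced target, and the remaining outputs are reference noise signals obtained by subtracting adjacent time-aligned channels, in the style of a generalized sidelobe canceller blocking matrix. This is an illustrative assumption, not the construction used in the paper, which the abstract does not detail. The function name `mimo_delay_and_sum`, the integer steering delays, and the use of circular shifts are all hypothetical simplifications.

```python
import numpy as np


def mimo_delay_and_sum(x, delays):
    """Hypothetical MIMO array processor built on delay-and-sum.

    x      : array of shape (channels, samples), one row per microphone.
    delays : integer sample delays of the target at each microphone
             (assumed known; circular shifts are used for simplicity).

    Returns (target, noise_refs):
      target     : shape (samples,), the delay-and-sum output.
      noise_refs : shape (channels - 1, samples), differences of adjacent
                   aligned channels, in which the target cancels.
    """
    x = np.asarray(x, dtype=float)
    # Undo each channel's delay so the target is time-aligned everywhere.
    aligned = np.stack([np.roll(ch, -d) for ch, d in zip(x, delays)])
    # Output 1: average of the aligned channels (enhanced target).
    target = aligned.mean(axis=0)
    # Outputs 2..M: adjacent-channel differences block the aligned target
    # and keep only signals arriving from other directions.
    noise_refs = aligned[1:] - aligned[:-1]
    return target, noise_refs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(256)       # stand-in for the target talker
    interferer = rng.standard_normal(256)   # stand-in for a competing talker
    target_delays = [0, 2, 4]
    interferer_delays = [0, 1, 2]
    mics = np.stack([
        np.roll(speech, d) + np.roll(interferer, e)
        for d, e in zip(target_delays, interferer_delays)
    ])
    enhanced, refs = mimo_delay_and_sum(mics, target_delays)
    print(enhanced.shape, refs.shape)
```

In a closely coupled system of the kind the abstract describes, the reference outputs would feed the recognizer's noise model rather than being discarded after enhancement; how that is done with VTS and EM is specific to the paper and is not reproduced here.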
