
Robust Distant Speech Recognition by Combining Position-Dependent CMN with Conventional CMN


Abstract

We have proposed an environmentally robust speech recognition method based on position-dependent cepstral mean normalization (PD-CMN) to compensate for channel distortion that depends on speaker position. PD-CMN can efficiently compensate for the channel transmission characteristics, but it cannot normalize speaker variation because the position-dependent cepstral mean does not contain speaker characteristics. Conventional CMN can compensate for speaker variation, but it cannot achieve good recognition performance for short utterances. In this paper, we propose a robust distant speech recognition method that combines position-dependent CMN with conventional CMN to address these problems. The position-dependent cepstral mean is linearly combined with the conventional cepstral mean using the following two types of processing. The first method uses a fixed weighting coefficient over the whole test data to obtain the combinational CMN; it is called fixed-weight combinational CMN. The second method calculates the output probabilities of multiple features compensated with a variable weighting coefficient at each frame, and a single decoder uses these output probabilities to perform speech recognition; it is called variable-weight combinational CMN. We evaluated the proposed method on small-vocabulary (100 words) distant isolated word recognition in a real environment. The proposed variable-weight combinational CMN method achieved relative error reduction rates of 56.3% compared with conventional CMN and 22.2% compared with PD-CMN.
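The linear combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the interpolation form `weight * pd_mean + (1 - weight) * utt_mean` is an assumption, as are all function and parameter names below. For the variable-weight variant, the abstract does not state how the per-weight output probabilities are merged inside the decoder; the sketch simply keeps the best-scoring candidate weight per frame, and a single diagonal Gaussian stands in for an HMM state output distribution. Both choices are illustrative only, not the authors' implementation.

```python
import math
from typing import List, Sequence

Vector = List[float]


def utterance_mean(frames: Sequence[Sequence[float]]) -> Vector:
    """Conventional CMN statistic: per-dimension mean over one utterance."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def combine_means(pd_mean: Sequence[float], utt_mean: Sequence[float],
                  weight: float) -> Vector:
    """Assumed linear combination: weight=1 gives PD-CMN, weight=0 gives CMN."""
    return [weight * p + (1.0 - weight) * u for p, u in zip(pd_mean, utt_mean)]


def fixed_weight_cmn(frames: Sequence[Sequence[float]],
                     pd_mean: Sequence[float], weight: float) -> List[Vector]:
    """Subtract the combined mean from every cepstral frame of the utterance."""
    mean = combine_means(pd_mean, utterance_mean(frames), weight)
    return [[c - m for c, m in zip(f, mean)] for f in frames]


def diag_gauss_loglik(x: Sequence[float], mu: Sequence[float],
                      var: Sequence[float]) -> float:
    """Log-likelihood of x under a diagonal Gaussian (stand-in for a state pdf)."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mu, var))


def best_weight_frame_score(frame: Sequence[float], pd_mean: Sequence[float],
                            utt_mean: Sequence[float], weights: Sequence[float],
                            mu: Sequence[float], var: Sequence[float]) -> float:
    """Score one frame under each candidate weight; keep the max (assumed rule)."""
    scores = []
    for w in weights:
        mean = combine_means(pd_mean, utt_mean, w)
        compensated = [c - m for c, m in zip(frame, mean)]
        scores.append(diag_gauss_loglik(compensated, mu, var))
    return max(scores)
```

With `weight = 0.0` the function `fixed_weight_cmn` reduces to conventional CMN, and with `weight = 1.0` it reduces to PD-CMN, so intermediate values trade speaker normalization against robustness on short utterances.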
