Conference: IEEE Workshop on Automatic Speech Recognition and Understanding

An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework

Abstract

We present an information fusion approach to robust recognition of microphone array speech for the recently launched 3rd CHiME Challenge. It is based on a deep learning framework with a large neural network consisting of subnets with different architectures. Multiple knowledge sources are integrated in two stages: an early fusion, in which normalized noisy features obtained with different beamforming techniques, speech-enhanced features, speaker-related features, and other auxiliary features are concatenated as the input to each subnet, and a late fusion, in which the outputs of all subnets are combined into a single output set. Our experiments demonstrate that all information sources are complementary in our proposed framework. Our best system achieves an average word error rate reduction of 68% from the officially released baseline results on the test set of real data.
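
The two fusion stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the feature dimensions, the toy single-layer subnets (`make_toy_subnet`), and the weighted averaging used for late fusion are all assumptions made for illustration, since the abstract does not specify how the subnet outputs are combined.

```python
import numpy as np


def early_fusion(streams):
    """Concatenate per-frame feature streams along the feature axis."""
    frame_counts = {s.shape[0] for s in streams}
    if len(frame_counts) != 1:
        raise ValueError("all streams must have the same number of frames")
    return np.concatenate(streams, axis=1)


def softmax(x):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def make_toy_subnet(in_dim, n_classes, seed):
    """Hypothetical stand-in for a trained subnet: random linear layer + softmax."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=(in_dim, n_classes))

    def subnet(x):
        return softmax(x @ w)

    return subnet


def late_fusion(posteriors, weights=None):
    """Weighted average of per-subnet posteriors (an assumed combination rule)."""
    stacked = np.stack(posteriors, axis=0)  # (n_subnets, frames, classes)
    if weights is None:
        weights = np.ones(len(posteriors))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, stacked, axes=1)  # (frames, classes)


def run_demo():
    rng = np.random.default_rng(0)
    n_frames = 5
    # Placeholder feature streams; the dimensions are illustrative only.
    beamformed = rng.normal(size=(n_frames, 40))
    enhanced = rng.normal(size=(n_frames, 40))
    # One speaker vector repeated on every frame of the utterance.
    speaker = np.tile(rng.normal(size=(1, 10)), (n_frames, 1))

    fused_in = early_fusion([beamformed, enhanced, speaker])
    subnets = [make_toy_subnet(fused_in.shape[1], 8, seed) for seed in (1, 2, 3)]
    fused_out = late_fusion([net(fused_in) for net in subnets])
    return fused_in, fused_out


if __name__ == "__main__":
    fused_in, fused_out = run_demo()
    print(fused_in.shape, fused_out.shape)
```

In this sketch every subnet receives the same concatenated input and differs only in its weights; in the framework the abstract describes, the subnets have different architectures. Averaging keeps each fused row a valid probability distribution, which is one reason it is a common choice for combining posteriors.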
