Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Speech recognition robust against speech overlapping in monaural recordings of telephone conversations


Abstract

Monaural (single-channel) recording is sometimes used for telephone conversations in call centers. Generally speaking, automatic speech recognition is less accurate on a monaural recording than on a multi-channel recording of the same conversation in which each speaker's voice is recorded separately. The major reason is that the recognition system fails not only in the segments where the voices of multiple speakers overlap, but also in the neighboring segments surrounding them. In this paper, we tackle this problem by combining garbage modeling with noise-robust monaural acoustic modeling. Our proposed method trains the models using multi-channel recordings and transcripts, which are easier to prepare than monaural recordings and transcripts. We present experimental results in which the proposed method reduced the error rates by approximately 3% relative to the baseline methods for both the GMM-HMM and CNN-HMM cases. Because the proposed method is quite simple, it is easy to deploy to a wide range of ASR systems for monaural speech transcription.
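The abstract does not spell out how the multi-channel data is turned into monaural training material, so the following is only a minimal sketch of one plausible preparation step, not the authors' procedure. It assumes a two-channel recording with one speaker per channel and per-speaker segment times taken from the transcripts. The names `mix_to_mono` and `overlap_regions` are hypothetical, as is the choice to average the channels and to treat every intersecting time span as a candidate region for a garbage label.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec), assumed to come from the transcripts


def mix_to_mono(ch_a: List[float], ch_b: List[float]) -> List[float]:
    """Simulate a monaural recording by averaging two channels sample by sample.

    Assumption: both channels share one sampling rate; the shorter one is
    padded with silence (zeros) so that no samples are dropped.
    """
    n = max(len(ch_a), len(ch_b))
    a = ch_a + [0.0] * (n - len(ch_a))
    b = ch_b + [0.0] * (n - len(ch_b))
    return [0.5 * (x + y) for x, y in zip(a, b)]


def overlap_regions(segs_a: List[Segment], segs_b: List[Segment]) -> List[Segment]:
    """Return the time spans where a segment of speaker A intersects one of speaker B.

    Assumption: these spans are the candidates that a garbage model would cover.
    """
    regions = []
    for a_start, a_end in segs_a:
        for b_start, b_end in segs_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only intersections with positive duration
                regions.append((start, end))
    return sorted(regions)
```

For example, calling `overlap_regions([(0.0, 2.0), (5.0, 6.0)], [(1.5, 3.0), (5.5, 7.0)])` yields the two spans in which both speakers are active. Because the segment times come from the separately recorded channels, the overlap labels need no manual annotation of the mixed signal, which is consistent with the abstract's point that multi-channel material is easier to prepare.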
