INTERSPEECH 2012

Voice Activity Detection Using Speech Recognizer Feedback

Abstract

This paper demonstrates how feedback from a speech recognizer can be leveraged to improve Voice Activity Detection (VAD) for online speech recognition. First, reliably transcribed segments of audio are fed back by the recognizer as supervision for VAD model adaptation. This allows the much stronger LVCSR acoustic models to be harnessed without adding computation. Second, the timing of each VAD decision is dictated by the recognizer rather than by the VAD module, which gives the VAD an implicit dynamic look-ahead. This look-ahead improves robustness, and it can be gracefully reduced to meet latency requirements without retraining or retuning the VAD module. Experiments on telephone conversations yielded a 6.7% absolute reduction in frame classification error rate when feedback was applied to HMM-based VAD, and a 4.2% absolute reduction over the best baseline system. Furthermore, a 3.0% absolute WER reduction was achieved over the best baseline in speech recognition experiments.
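
The first idea in the abstract, using reliably transcribed segments as supervision for VAD adaptation, can be illustrated with a toy sketch. This is not the paper's system: the abstract describes HMM-based VAD driven by LVCSR feedback, whereas the sketch below swaps in a single Gaussian per class over one scalar feature (for example, frame log-energy) with a running-mean update. All names (`Gaussian`, `FeedbackVAD`, `min_conf`, `frame_error_rate`) and the confidence threshold are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float
    var: float
    count: float  # prior pseudo-count; larger values make the mean adapt more slowly

    def log_likelihood(self, x: float) -> float:
        return -0.5 * (math.log(2 * math.pi * self.var) + (x - self.mean) ** 2 / self.var)


class FeedbackVAD:
    """Toy two-class VAD over a scalar feature (hypothetical, for illustration)."""

    def __init__(self, speech: Gaussian, nonspeech: Gaussian, min_conf: float = 0.9):
        self.speech = speech
        self.nonspeech = nonspeech
        self.min_conf = min_conf  # assumed reliability threshold for feedback

    def is_speech(self, x: float) -> bool:
        return self.speech.log_likelihood(x) > self.nonspeech.log_likelihood(x)

    def adapt(self, frames, labels, confidence: float) -> bool:
        """Update class means from one recognizer-labelled segment.

        Returns False and leaves the model unchanged when the segment's
        confidence is below min_conf.
        """
        if confidence < self.min_conf:
            return False
        for x, is_speech_frame in zip(frames, labels):
            g = self.speech if is_speech_frame else self.nonspeech
            g.count += 1.0
            g.mean += (x - g.mean) / g.count  # running mean, prior weighted by pseudo-count
        return True


def frame_error_rate(vad: FeedbackVAD, frames, reference) -> float:
    """Fraction of frames whose speech/non-speech decision differs from the reference."""
    errors = sum(vad.is_speech(x) != ref for x, ref in zip(frames, reference))
    return errors / len(frames)
```

In this sketch the recognizer's role is reduced to supplying per-frame labels and one confidence score per segment. Segments below the threshold are ignored, so unreliable transcriptions never move the model. Comparing `frame_error_rate` before and after `adapt` gives an absolute change in the same sense as the absolute reductions quoted in the abstract.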