Published in: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Silence is golden: Modeling non-speech events in WFST-based dynamic network decoders

Abstract

Models for silence are a fundamental part of continuous speech recognition systems. Depending on application requirements, audio data segmentation, and the availability of detailed training data annotations, it may be necessary or beneficial to distinguish further non-speech events, for example breath and background noise. Integrating multiple non-speech models in a WFST-based dynamic network decoder is not straightforward, because these models do not fit perfectly into the transducer framework. This paper describes several options for constructing the transducer with multiple non-speech models, shows that they differ considerably in memory and runtime efficiency, and analyzes their impact on recognition performance.