Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

A PROSODY-BASED APPROACH TO END-OF-UTTERANCE DETECTION THAT DOES NOT REQUIRE SPEECH RECOGNITION

Abstract

In previous work we showed that state-of-the-art end-of-utterance detection (as used, for example, in dialog systems) can be improved significantly by making use of prosodic and/or language models that predict utterance endpoints, based on word and alignment output from a speech recognizer. However, using a recognizer in endpointing might not be practical in certain applications. In this paper we demonstrate that the improvements due to the prosodic knowledge can be realized largely without alignment information, i.e., without requiring a speech recognizer. A prosodic end-of-utterance detector using only speech/nonspeech detection output is still considerably more accurate and has lower latency than a baseline system based on pause-length thresholding.
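
As a rough illustration of the baseline mentioned in the abstract, pause-length thresholding can be sketched as follows. This is a minimal sketch and not the authors' system: the function name, the 10 ms frame size, and the 500 ms threshold are all assumptions chosen for illustration, and the prosodic detector itself is not reproduced here. The input is a per-frame speech/nonspeech decision sequence, as a speech/nonspeech detector might produce.

```python
from typing import Optional, Sequence


def detect_end_of_utterance(
    is_speech: Sequence[bool],
    frame_ms: int = 10,             # assumed frame duration (hypothetical)
    pause_threshold_ms: int = 500,  # assumed pause threshold (hypothetical)
) -> Optional[int]:
    """Return the index of the frame at which the endpoint is declared.

    The endpoint is declared once the current run of nonspeech frames,
    following at least one speech frame, reaches the pause threshold.
    Returns None if no such pause occurs.
    """
    pause_ms = 0
    heard_speech = False
    for i, speech in enumerate(is_speech):
        if speech:
            # Any speech frame resets the running pause length.
            heard_speech = True
            pause_ms = 0
        elif heard_speech:
            # Count nonspeech only after the utterance has started.
            pause_ms += frame_ms
            if pause_ms >= pause_threshold_ms:
                return i
    return None


# Hypothetical input: leading silence, speech, a short 100 ms pause,
# more speech, then a long trailing silence.
frames = [False] * 5 + [True] * 20 + [False] * 10 + [True] * 10 + [False] * 60
endpoint = detect_end_of_utterance(frames)
```

With a fixed threshold, the endpoint cannot be declared sooner than `pause_threshold_ms` after the last speech frame, so lowering the threshold reduces latency at the cost of cutting speakers off at utterance-internal pauses. This is the trade-off that the abstract's prosodic detector is reported to improve on.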
