Prosody dependent speech recognition on American radio news speech.

Abstract

Prosody (the melody and rhythm of natural speech), although important for human speech recognition, has not been fully utilized in large-vocabulary continuous speech recognition. In this dissertation, we propose a novel "prosody-dependent speech recognition" framework, in which words and prosody are recognized simultaneously in order to improve word recognition accuracy. We review the linguistic literature on how prosody is used in human speech communication, what the structure and function of prosody are, and how prosody affects the acoustic realization of the segmental and suprasegmental units of speech. We conduct an information-theoretic analysis, proving that when prosody is modeled, better word recognition can be achieved through the interaction between the acoustic model and the language model. We conduct detailed experiments to determine the set of allophonic HMMs or probability distributions that are sensitive to prosody, guided by prior linguistic knowledge and empirical selection rules. We measure the effects of prosody on language modeling and on pronunciation modeling, and propose a factored approach, which leverages the strong dependence of prosody on syntax, to improve the robustness of N-gram language modeling. We also develop an automatic prosody labeling system, based on a simplified version of the Tones and Break Indices (ToBI) system, as a way to reduce the cost of human labeling. In the word recognition experiment on the Boston University Radio News Corpus, we find that our system reduces word error rate by as much as 11% absolute, compared with a conventional prosody-independent HMM-based automatic speech recognizer with a comparable parameter count. Our research confirms that explicit modeling of prosody in HMM-based automatic speech recognizers can improve word recognition on American radio news speech.
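
The abstract reports its main result as an absolute reduction in word error rate (WER). As a minimal sketch of what that metric measures, the Python below computes WER as a word-level edit distance divided by the reference length, and separates absolute from relative reduction. The function names and the example sentences are illustrative assumptions, not taken from the dissertation, and the numbers they produce are not the dissertation's results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, system_wer: float) -> float:
    """Difference in WER, in the same units as the inputs."""
    return baseline_wer - system_wer


def relative_reduction(baseline_wer: float, system_wer: float) -> float:
    """Reduction expressed as a fraction of the baseline WER."""
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical transcripts, invented purely for illustration.
    reference = "the senate passed the bill today"
    baseline = "the senate past a bill today"
    system = "the senate passed a bill today"
    base_wer = word_error_rate(reference, baseline)
    sys_wer = word_error_rate(reference, system)
    print(f"baseline WER: {base_wer:.3f}, system WER: {sys_wer:.3f}")
    print(f"absolute reduction: {absolute_reduction(base_wer, sys_wer):.3f}")
    print(f"relative reduction: {relative_reduction(base_wer, sys_wer):.3f}")
```

The distinction matters when reading the reported figure: an absolute reduction is a difference in percentage points, so its practical size depends on the baseline WER, which the abstract does not state.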