Prosody dependent speech recognition on American radio news speech.

Abstract

Prosody (the melody and rhythm of natural speech), although important for human speech recognition, has not been fully utilized in large-vocabulary continuous speech recognition. In this dissertation, we propose a novel "prosody-dependent speech recognition" framework, in which words and prosody are recognized simultaneously in order to improve word recognition accuracy. We review the linguistic literature on how prosody is used in human speech communication, what the structure and function of prosody are, and how prosody affects the acoustic realization of the segmental and suprasegmental units of speech. We conduct an information-theoretic analysis, proving that when prosody is modeled, better word recognition can be achieved through the interaction between the acoustic model and the language model. We conduct detailed experiments to determine the set of allophonic HMMs or probability distributions that are sensitive to prosody, guided by linguistic prior knowledge and empirical selection rules. We measure the effects of prosody on language modeling and on pronunciation modeling, and propose a factored approach, which leverages the strong dependence of prosody on syntax, to improve the robustness of N-gram language modeling. We also develop an automatic prosody labeling system, based on a simplified version of the Tones and Break Indices system, as a way to reduce the cost of human labeling. In the word recognition experiment on the Boston University Radio News Corpus, our system reduces word error rate by as much as 11% absolute compared with a conventional prosody-independent HMM-based automatic speech recognizer with a comparable parameter count. Our research confirms that explicit modeling of prosody in HMM-based automatic speech recognizers can improve word recognition on American radio news speech.
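
The simultaneous recognition of words and prosody described in the abstract can be illustrated with a standard maximum a posteriori formulation. This is only a sketch: the symbols O (acoustic observations), W (word sequence), and P (prosody label sequence) are assumed here for illustration and do not appear in the abstract, and the dissertation's own notation and model structure may differ.

```latex
% Conventional prosody-independent recognizer: search over words only.
\[
  \hat{W} = \arg\max_{W} \; p(O \mid W)\, p(W)
\]

% Prosody-dependent recognizer (assumed form): search jointly over words
% and prosody, so both the acoustic model and the language model are
% conditioned on prosody.
\[
  (\hat{W}, \hat{P}) = \arg\max_{W,\,P} \; p(O \mid W, P)\, p(W, P)
\]

% The joint language model can always be split by the chain rule:
\[
  p(W, P) = p(W)\, p(P \mid W)
\]
```

Under this reading, a factored language model would approximate the term p(P | W) through an intermediate syntactic representation rather than estimating joint word-prosody N-grams directly, which is one plausible interpretation of the "factored approach" the abstract mentions.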

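Bibliographic record

Author: Chen, Ken
Author affiliation: University of Illinois at Urbana-Champaign
Degree-granting institution: University of Illinois at Urbana-Champaign
Subject: Engineering, Electronics and Electrical; Language, Linguistics; Computer Science
Degree: Ph.D.
Year: 2004
Pages: 124
Original format: PDF
Language of text: English
Chinese Library Classification: Radio electronics and telecommunications technology; Linguistics; Automation and computer technology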

Similar literature

1. Prosody dependent speech recognition on radio news corpus of American English [J]. Chen K., Hasegawa-Johnson M., Cohen A. IEEE Transactions on Audio, Speech, and Language Processing. 2006, no. 1.
2. Simultaneous recognition of words and prosody in the Boston University Radio Speech Corpus [J]. Mark Hasegawa-Johnson, Ken Chen, Jennifer Cole. Speech Communication. 2005, no. 3-4.
3. The Principle of Distinctive and Contrastive Coherence of Prosody in Radio News: An Analysis of Perception and Recognition [J]. Rodero, Emma. Journal of Nonverbal Behavior. 2015, no. 1.
4. Prosody dependent Mandarin speech recognition [C]. Ni Chong-Jia, Liu Wen-Ju, Xu Bo. The 2011 International Joint Conference on Neural Networks. 2011.
5. Prosody production and perception with conversational speech [D]. Mo, Yoonsook. 2010.
6. Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition [O]. Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan.
7. Prosody dependent speech recognition on radio news corpus of American English [O]. Ken Chen, Mark Hasegawa-Johnson. 2006.
