Prosody for Mandarin Speech Recognition: A Comparative Study of Read and Spontaneous Speech

Abstract

In this paper, we present a comparative study of spontaneous and read Mandarin speech in the context of automatic speech recognition. We focus on the analysis and modeling of prosodic features, based on a unique speech corpus that contains similar amounts of read and spontaneous speech data from the same group of speakers. Statistical analysis is carried out on tone contours and on the duration of syllable and sub-syllable units. Speech recognition experiments are performed to evaluate the effectiveness of different approaches to incorporating prosodic features into acoustic modeling. A key problem addressed is how to deal with unvoiced frames, where F0 values are unavailable. We apply the technique of multi-space distribution (MSD) to model partially continuous F0 contours. For spontaneous speech, the tonal-syllable error rate is reduced from the MFCC baseline of 64.8% to 59.4% with the MSD-based prosody model. For read speech, the tonal-syllable error rate is reduced from 46.0% to 36.4%.
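
The abstract does not describe the paper's MSD configuration, so the following is only a minimal sketch of the general idea, assuming the common two-space formulation: a one-dimensional Gaussian over log F0 for voiced frames and a zero-dimensional space for unvoiced frames, each with its own weight. The function names, the parameter names, and the choice of log F0 are illustrative assumptions, not details taken from the paper.

```python
import math


def msd_log_likelihood(f0_hz, voiced_weight, mean_log_f0, var_log_f0):
    """Log-likelihood of one frame under a two-space MSD (illustrative sketch).

    f0_hz is None for an unvoiced frame, i.e. a frame with no F0 value.
    All parameter names here are hypothetical.
    """
    if not 0.0 < voiced_weight < 1.0:
        raise ValueError("voiced_weight must lie strictly between 0 and 1")
    if f0_hz is None:
        # The unvoiced space is zero-dimensional: only its weight contributes.
        return math.log(1.0 - voiced_weight)
    # The voiced space is a 1-D Gaussian over log F0, scaled by its weight.
    x = math.log(f0_hz)
    log_gauss = -0.5 * (
        math.log(2.0 * math.pi * var_log_f0)
        + (x - mean_log_f0) ** 2 / var_log_f0
    )
    return math.log(voiced_weight) + log_gauss


def contour_log_likelihood(contour, **params):
    """Sum frame log-likelihoods over a partially continuous F0 contour."""
    return sum(msd_log_likelihood(f0, **params) for f0 in contour)
```

With this formulation, a contour such as `[None, 198.0, 203.5, None]` can be scored directly, without interpolating F0 across the unvoiced frames.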

Bibliographic details

Source
International Speech Communication Association, 2008, 4 pages
Authors
Yu Ting Yeung; Yao Qian; Tan Lee; Frank K. Soong
Original format
PDF
Chinese Library Classification
TN912.3-532
Keywords
Spontaneous speech recognition; Prosody; Mandarin
