Lecture speech recognition considering the speaking rate variation

Kozo Okuda; Tatsuya Kawahara; Satoshi Nakamura

Abstract

In lecture speech recognition, the performance of a speech recognition system degrades as the speaking rate increases. The degradation arises because the acoustic characteristics change not only in the frequency domain but also in the time domain, so normalizing or compensating for the speaking rate is important. In this paper, we propose a speaking rate compensation method that selects an optimal frame period and frame length using a likelihood criterion. The method changes the frame period and frame length to compensate for the speaking rate. Because the optimal frame period and length differ from utterance to utterance, the proposed method runs speech recognition with various frame periods and lengths and determines the optimal pair for the target speech using the acoustic likelihood normalized by the frame period, together with the language likelihood. In a recognition experiment on the CSJ corpus, the method improves performance for speech with a high speaking rate.
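The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the exact normalization formula, so the scaling of the acoustic log-likelihood by `frame_period_ms / reference_period_ms` is an assumption, as are the names `Hypothesis`, `normalized_score`, `select_frame_setting`, the `decode` callback, the 10 ms reference period, and the `lm_weight` parameter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

FrameSetting = Tuple[float, float]  # (frame period in ms, frame length in ms)


@dataclass(frozen=True)
class Hypothesis:
    """Result of one recognition pass (hypothetical container)."""
    text: str
    acoustic_loglik: float  # log-likelihood summed over all frames
    language_loglik: float  # language model log-likelihood


def normalized_score(hyp: Hypothesis,
                     frame_period_ms: float,
                     reference_period_ms: float = 10.0,
                     lm_weight: float = 1.0) -> float:
    # ASSUMPTION: a shorter frame period yields more frames and therefore
    # more summed log-likelihood terms, so the acoustic score is rescaled
    # to a common time base before it is compared across settings.
    scale = frame_period_ms / reference_period_ms
    return hyp.acoustic_loglik * scale + lm_weight * hyp.language_loglik


def select_frame_setting(
    decode: Callable[[float, float], Hypothesis],
    settings: Iterable[FrameSetting],
) -> Optional[Tuple[FrameSetting, Hypothesis, float]]:
    """Decode once per setting and keep the highest normalized score."""
    best: Optional[Tuple[FrameSetting, Hypothesis, float]] = None
    for period_ms, length_ms in settings:
        # `decode` stands in for a full recognition pass on one utterance.
        hyp = decode(period_ms, length_ms)
        score = normalized_score(hyp, period_ms)
        if best is None or score > best[2]:
            best = ((period_ms, length_ms), hyp, score)
    return best
```

Here `decode` would wrap a complete recognizer run on a single utterance with the given front-end settings. Without some normalization, settings with longer frame periods would tend to win simply because they accumulate fewer log-likelihood terms.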

Bibliographic details

Source
電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication, 2001, No. 521, 6 pages
Authors
Kozo Okuda; Tatsuya Kawahara; Satoshi Nakamura
Original format: PDF
Text language: Japanese (jpn)
Chinese Library Classification: Communications
Keywords
Automatic speech recognition; Lecture speech; Speaking rate; Frame period; Frame length

Similar literature

1. Lecture speech recognition considering the speaking rate variation [J]. Kozo Okuda, Tatsuya Kawahara, Satoshi Nakamura. 電子情報通信学会技術研究報告. 音声. Speech. 2001, No. 523
2. Lecture speech recognition considering the speaking rate variation [J]. Kozo Okuda, Tatsuya Kawahara, Satoshi Nakamura. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2001, No. 521
3. Lecture speech recognition considering the speaking rate variation [J]. Kozo Okuda, Tatsuya Kawahara, Satoshi Nakamura. 電子情報通信学会技術研究報告. 音声. Speech. 2001, No. 523
4. Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition [C]. Nanjo, H., Kawahara, . 2002
5. Effects of equipment variations on speaker recognition error rates [D]. Shaver, Clark D. 2009
6. Effect of Speaking Rate on Recognition of Synthetic and Natural Speech by Normal-Hearing and Cochlear Implant Listeners [O]. Caili Ji, John J. Galvin III, Anting Xu, -1
7. Speaker Adaptation By Modeling The Speaker Variation In A Continuous Speech Recognition System [O]. Nikko Ström. 2007
8. Minimizing Speaker Variation Effects for Speaker-Independent Speech Recognition [R]. Huang, X. 1992
