A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition

Yoo Rhee OH; Hong Kook KIM


Abstract

In this paper, we propose a hybrid model adaptation approach that adapts both the pronunciation model and the acoustic model to the pronunciation and acoustic variability of non-native speech, with the aim of improving non-native automatic speech recognition (ASR). The hybrid adaptation can be performed at either the state-tying level or the triphone-modeling level, depending on the level at which the acoustic model is adapted. In both methods, we first analyze the pronunciation variant rules of non-native speakers and then classify each rule as either a pronunciation variant or an acoustic variant. The state-tying level hybrid method adapts the pronunciation model by adding the pronunciation variants to the pronunciation dictionary, and adapts the acoustic model by clustering the states of the triphone acoustic models using the acoustic variants. The triphone-modeling level hybrid method adapts the pronunciation model in the same way; for acoustic model adaptation, however, the triphone acoustic models are first re-estimated based on the adapted pronunciation model, and the states of the re-estimated triphone models are then clustered using the acoustic variants. Recognition experiments on English spoken by Korean speakers show that, compared with a baseline ASR system, ASR systems employing the state-tying level and triphone-modeling level adaptation methods achieve relative reductions in average word error rate (WER) of 17.1% and 22.1%, respectively, for non-native speech.
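The reported 17.1% and 22.1% figures are relative reductions, so they scale with the baseline WER rather than being subtracted from it. The short sketch below illustrates the arithmetic only. The abstract does not report absolute WERs, so the baseline value of 20.0% is a hypothetical number chosen for illustration, and the function names are likewise assumptions rather than anything taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - adapted) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


def wer_after_relative_reduction(baseline_wer: float, reduction_pct: float) -> float:
    """WER implied by a given relative reduction (in percent) from the baseline."""
    return baseline_wer * (1.0 - reduction_pct / 100.0)


# Hypothetical baseline WER (percent), for illustration only;
# the abstract gives relative reductions, not absolute WERs.
HYPOTHETICAL_BASELINE_WER = 20.0

# Relative reductions reported in the abstract for the two hybrid methods.
reported_reductions = [("state-tying level", 17.1), ("triphone-modeling level", 22.1)]

for label, reduction_pct in reported_reductions:
    adapted = wer_after_relative_reduction(HYPOTHETICAL_BASELINE_WER, reduction_pct)
    print(f"{label}: {adapted:.2f}% WER from a {HYPOTHETICAL_BASELINE_WER:.1f}% baseline")
```

Under this assumed baseline, a 17.1% relative reduction corresponds to a WER of 16.58%, not 2.9%, which is what an absolute reading of the figure would wrongly suggest.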


Bibliographic information

Source
IEICE Transactions on Information and Systems | 2010, No. 9 | pp. 2379-2387 | 9 pages
Authors
Yoo Rhee OH; Hong Kook KIM
Affiliation

School of Information and Communications, Gwangju Institute of Science and Technology (GIST), 1 Oryong-dong, Buk-gu, Gwangju 500-712, Korea

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
non-native speech recognition; pronunciation variability; acoustic model adaptation; pronunciation model adaptation; state-tying level hybrid adaptation; triphone-modeling level hybrid adaptation


Similar literature

1. Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition [J]. Yoo Rhee Oh, Jae Sam Yoon, Hong Kook Kim. Speech Communication. 2007, No. 1

2. Multilingual recognition of non-native speech using acoustic model transformation and pronunciation modeling [J]. G. Bouselmi, D. Fohr, I. Illina. International Journal of Speech Technology. 2012, No. 2

3. A hybrid approach to adapting acoustic and pronunciation models for non-native speech recognition [C]. Yoo Rhee Oh, Hong Kook Kim. Asilomar Conference on Signals, Systems and Computers. 2009

4. Acoustic model and pronunciation adaptation in automatic speech recognition [D]. Yongxin Zhang. 2006

5. Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels [O]. Santiago-Omar Caballero-Morales. 2013

6. Acoustic Pronunciation Variations Modeling for Standard Malay Speech Recognition [O]. Kamaruzaman Jusoff, Noraini Seman. 2009
