OPTIMIZING MULTIPLE PRONUNCIATION DICTIONARY BASED ON A CONFUSABILITY MEASURE FOR NON-NATIVE SPEECH RECOGNITION


Abstract

This paper addresses efficient pronunciation variation modeling for non-native automatic speech recognition (ASR), where non-native speech is characterized mainly by pronunciations that differ from those of native speech. To improve the performance of non-native ASR, a multiple pronunciation dictionary built with an indirect data-driven approach is first proposed. However, this approach enlarges the search space for ASR decoding because the dictionary grows. We therefore propose a method for optimizing the size of the multiple pronunciation dictionary by removing confusable pronunciation variants from it. To this end, a confusability measure based on the Levenshtein distance between two different pronunciation variants is also proposed. In addition, the number of phonemes in each pronunciation variant is used to optimize the dictionary size. To investigate the effect of the proposed approach on ASR performance, English is selected as the target language, and English utterances spoken by Koreans are used as non-native speech. Continuous non-native ASR experiments show that the ASR system using the optimized multiple pronunciation dictionary reduces the average word error rate by 13.53% and the computational complexity by 21.10%, in relative terms, compared with the system using the multiple pronunciation dictionary without optimization.
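The abstract does not give the exact form of the confusability measure or say how the phoneme count enters the optimization, so the following is only a minimal sketch of the general idea under stated assumptions. It computes the Levenshtein distance over phoneme sequences and drops an added variant when it lies too close to the canonical pronunciation of a different word, or when it is very short. The names `prune_confusable`, `min_distance`, and `min_phonemes`, the thresholds, and the toy lexicon are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Pron = Tuple[str, ...]  # one pronunciation = a tuple of phoneme symbols


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two phoneme sequences (unit cost per edit)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def prune_confusable(lexicon: Dict[str, List[Pron]],
                     min_distance: int = 2,
                     min_phonemes: int = 3) -> Dict[str, List[Pron]]:
    """Keep each word's first (canonical) pronunciation and drop added variants
    that are confusable with another word or shorter than min_phonemes.

    Assumption: "confusable" means the variant is closer than min_distance to
    the canonical pronunciation of a different word. The paper's actual
    measure may differ.
    """
    pruned: Dict[str, List[Pron]] = {}
    for word, prons in lexicon.items():
        canonical, variants = prons[0], prons[1:]
        others = [ps[0] for w, ps in lexicon.items() if w != word]
        kept = [canonical]
        for variant in variants:
            too_short = len(variant) < min_phonemes
            confusable = any(levenshtein(variant, o) < min_distance
                             for o in others)
            if not (too_short or confusable):
                kept.append(variant)
        pruned[word] = kept
    return pruned


# Toy lexicon (hypothetical): the first entry per word is canonical,
# the remaining entries stand in for data-driven non-native variants.
example_lexicon: Dict[str, List[Pron]] = {
    "light": [("L", "AY", "T"), ("R", "AY", "T"), ("L", "AY", "T", "UH")],
    "right": [("R", "AY", "T"), ("L", "AY", "T")],
    "rice": [("R", "AY", "S")],
}
optimized = prune_confusable(example_lexicon)
```

In this toy lexicon the variant of "light" that is identical to the canonical pronunciation of "right" is removed, while the variant with a trailing vowel is kept because it is at distance 2 from every other word's canonical form. A smaller dictionary of this kind is what shrinks the decoding search space.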

Bibliographic details

Source: IASTED international conference on artificial intelligence and applications, 2008, 6 pages
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords: non-native speech recognition; pronunciation variation modeling; multiple pronunciation dictionary; confusability measure

