A Two-Pass Framework of Mispronunciation Detection and Diagnosis for Computer-Aided Pronunciation Training

Xiaojun Qian; Helen Meng; Frank Soong


Abstract

This paper presents a two-pass framework with discriminative acoustic modeling for mispronunciation detection and diagnosis (MD&D). The first pass, mispronunciation detection, does not require explicit modeling of phonetic error patterns. For each canonical phone, the framework instantiates a set of antiphones and a filler model to augment the original phone model. This guarantees full coverage of all possible error patterns while maximally exploiting the phonetic information derived from the text prompt. The antiphones detect substitutions, the filler model detects insertions, and phone skips are allowed in order to detect deletions. As such, no prior assumption is made about which error patterns can occur. The second pass, mispronunciation diagnosis, expands the detected insertions and substitutions into phone networks, and another recognition pass attempts to reveal the phonetic identities of the detected mispronunciations. Discriminative training (DT) is applied separately to the acoustic models of the detection pass and of the diagnosis pass. DT effectively separates the acoustic models of the canonical phones from those of the antiphones. Overall, with DT in both passes of MD&D, the error rate is reduced by 40.4% relative to the maximum likelihood baseline. After DT, the error rates of the two passes are also lower than those of a strong single-pass baseline with DT, by 1.3% and 5.1% relative respectively, and both differences are statistically significant.
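The first-pass network described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Arc` type, the function names, the model labels (`anti_<phone>_<k>`, `FILLER`, `<eps>`), the linear node layout, and the default of one antiphone per phone are all assumptions made for illustration. The sketch only shows how a canonical phone sequence from the text prompt could be augmented with antiphone arcs (substitutions), filler self-loops (insertions), and skip arcs (deletions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    src: int    # start node
    dst: int    # end node
    label: str  # hypothetical model name; "<eps>" consumes no audio
    kind: str   # "canonical", "substitution", "insertion", or "deletion"


def build_detection_network(canonical, antiphones_per_phone=1):
    """Build a linear first-pass network over the canonical phone sequence.

    Node i sits before the i-th canonical phone; node len(canonical) is final.
    The topology and naming are assumptions, not taken from the paper.
    """
    arcs = []
    for i, phone in enumerate(canonical):
        # A filler self-loop lets the decoder absorb an inserted segment.
        arcs.append(Arc(i, i, "FILLER", "insertion"))
        # The original phone model: correct pronunciation.
        arcs.append(Arc(i, i + 1, phone, "canonical"))
        # Antiphones compete with the canonical phone to flag substitutions.
        for k in range(antiphones_per_phone):
            arcs.append(Arc(i, i + 1, f"anti_{phone}_{k}", "substitution"))
        # A skip arc lets the decoder bypass the phone: deletion.
        arcs.append(Arc(i, i + 1, "<eps>", "deletion"))
    # Allow an insertion after the last canonical phone as well.
    final = len(canonical)
    arcs.append(Arc(final, final, "FILLER", "insertion"))
    return arcs


def detected_errors(decoded_path):
    """Return (position, error kind) for every non-canonical arc on a path."""
    return [(arc.src, arc.kind) for arc in decoded_path if arc.kind != "canonical"]


if __name__ == "__main__":
    network = build_detection_network(["k", "ae", "t"])
    for arc in network:
        print(arc)
```

In this sketch a decoded path that takes an antiphone arc or a filler loop marks a position that the second pass would then expand into a phone network for diagnosis; deletions need no further expansion because the skipped phone is already known.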


Bibliographic information

Source: Audio, Speech, and Language Processing, IEEE/ACM Transactions on, 2016, Issue 6, pp. 1020-1028 (9 pages)
Authors: Xiaojun Qian; Helen Meng; Frank Soong
Author affiliation: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong
Original format: PDF
Language: English
Keywords: computer-aided pronunciation training; discriminative training; mispronunciation detection and diagnosis
