International Symposium on Chinese Spoken Language Processing

Mispronunciation Detection and Diagnosis in L2 English Speech Using Multi-Distribution Deep Neural Networks


Abstract

This paper investigates the use of multi-distribution Deep Neural Networks (DNNs) for mispronunciation detection and diagnosis (MD&D). Our existing approach uses extended recognition networks (ERNs) to constrain the recognition paths to the canonical pronunciation of the target words and the likely phonetic mispronunciations. Although this approach is viable, it has some problems: (1) deriving appropriate phonological rules to generate the ERNs remains a challenging task; (2) the acoustic model (AM) and the phonological rules are trained independently and hence contextual information is lost; and (3) phones missing from the ERNs cannot be recognized even if we have a well-trained AM. Hence, we propose an Acoustic Phonological Model (APM) using a multi-distribution DNN, whose input features include acoustic features and corresponding canonical pronunciations. The APM can implicitly learn the phonological rules from the canonical productions and annotated mispronunciations in the training data. Furthermore, the APM can also capture the relationships between the phonological rules and related acoustic features. As we do not restrict any pathways as in the ERNs, all phones can be recognized if we have a perfect APM. Experiments show that our method achieves an accuracy of 83.3% and a correctness of 88.5%. It significantly outperforms the approach of forced alignment with ERNs, whose correctness is 75.9%.
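
The abstract states that the APM takes both acoustic features and the corresponding canonical pronunciations as input. A minimal sketch of how such a mixed input vector might be assembled for one frame is given below. It assumes that the canonical phones are encoded as binary one-hot vectors and concatenated with real-valued acoustic features; the phone inventory, the feature dimensions, the context width, and the names `PHONES`, `one_hot`, and `apm_input` are all hypothetical and are not taken from the paper.

```python
# Hypothetical phone inventory; the paper's actual phone set is not given here.
PHONES = ["sil", "n", "ao", "r", "th", "f"]


def one_hot(phone):
    """Return a binary indicator vector for one phone of the inventory."""
    return [1.0 if p == phone else 0.0 for p in PHONES]


def apm_input(acoustic_frame, canonical, position, context=1):
    """Concatenate real-valued acoustic features with binary canonical-phone features.

    acoustic_frame: list of floats for one frame (feature type is an assumption).
    canonical: canonical phone sequence of the target word.
    position: index of the canonical phone aligned to this frame (assumed known).
    context: number of neighbouring canonical phones on each side (assumed).
    """
    features = list(acoustic_frame)
    for offset in range(-context, context + 1):
        idx = position + offset
        # Pad with silence outside the word boundaries.
        phone = canonical[idx] if 0 <= idx < len(canonical) else "sil"
        features.extend(one_hot(phone))
    return features


canonical = ["n", "ao", "r", "th"]  # canonical pronunciation of "north"
frame = [0.12, -0.53, 1.07]         # toy 3-dimensional acoustic frame
x = apm_input(frame, canonical, position=3)
print(len(x))  # 3 acoustic values + 3 context phones * 6 inventory entries = 21
```

In this sketch the network itself is omitted. A DNN trained on such vectors would output a posterior over all phones for each frame, which is consistent with the abstract's remark that no recognition pathway is restricted in advance.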
