Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

SED-MDD: Towards Sentence Dependent End-To-End Mispronunciation Detection and Diagnosis

Abstract

A mispronunciation detection and diagnosis (MD&D) system typically consists of multiple stages, such as an acoustic model, a language model and a Viterbi decoder. To integrate these stages, we propose SED-MDD, an end-to-end model for sentence-dependent MD&D. The proposed model takes a mel-spectrogram and characters as inputs and outputs the corresponding phone sequence. Our experiments show that SED-MDD can implicitly learn the phonological rules in both acoustic and linguistic features directly from the phonological annotation and transcription in the training data. To the best of our knowledge, SED-MDD is the first model of its kind. It achieves an accuracy of 86.35% and a correctness of 88.61% on L2-ARCTIC, significantly outperforming the existing end-to-end MD&D model CNN-RNN-CTC.
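
The abstract reports accuracy and correctness without defining them. The sketch below assumes the conventional phone-recognition definitions (correctness = (N - S - D) / N and accuracy = (N - S - D - I) / N, where N is the number of reference phones and S, D, I are the substitutions, deletions and insertions of a minimum-edit alignment). The paper may compute them differently. The function names and the phone labels are hypothetical and serve only as illustration.

```python
from typing import List, Tuple


def edit_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) of a minimum-edit alignment."""
    # dp[i][j] = (cost, S, D, I) for aligning ref[:i] with hyp[:j]
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost, s, d, ins = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (cost, s, d, ins)          # match
            else:
                best = (cost + 1, s + 1, d, ins)  # substitution
            cost, s, d, ins = dp[i - 1][j]
            if cost + 1 < best[0]:
                best = (cost + 1, s, d + 1, ins)  # deletion
            cost, s, d, ins = dp[i][j - 1]
            if cost + 1 < best[0]:
                best = (cost + 1, s, d, ins + 1)  # insertion
            dp[i][j] = best
    _, s, d, ins = dp[-1][-1]
    return s, d, ins


def correctness_and_accuracy(ref: List[str], hyp: List[str]) -> Tuple[float, float]:
    """Assumed definitions: correctness ignores insertions, accuracy penalizes them."""
    if not ref:
        raise ValueError("reference phone sequence must not be empty")
    s, d, ins = edit_counts(ref, hyp)
    n = len(ref)
    return (n - s - d) / n, (n - s - d - ins) / n


# Illustrative phone sequences (not taken from L2-ARCTIC).
reference = ["DH", "AH", "K", "AE", "T"]
predicted = ["D", "AH", "K", "AE", "T", "S"]
correctness, accuracy = correctness_and_accuracy(reference, predicted)
```

Under these assumed definitions, correctness is never lower than accuracy, because accuracy additionally subtracts insertions. That ordering matches the reported figures (88.61% versus 86.35%).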
