Conference: 2018 IEEE Spoken Language Technology Workshop

Investigating the Downstream Impact of Grapheme-Based Acoustic Modeling on Spoken Utterance Classification

Abstract

Automatic speech recognition (ASR) and natural language understanding are critical components of spoken language understanding (SLU) systems. One obstacle to providing services with SLU systems in multiple languages is the cost associated with acquiring all of the language-specific resources required for ASR in each language. Modeling graphemes eliminates the need to obtain a pronunciation dictionary which maps from speech sounds to words and is one way to reduce ASR resource dependencies when rapidly developing ASR in new languages. However, little is known about the downstream impact on SLU task performance when selecting graphemes as the acoustic modeling unit. This work investigates acoustic modeling for the ASR component of an SLU system using grapheme-based approaches together with convolutional and recurrent neural network architectures. We evaluate both ASR word accuracy and spoken utterance classification (SUC) accuracy for English, Italian and Spanish language tasks and find that it is possible to achieve SUC accuracy that is comparable to conventional phoneme-based systems which leverage a pronunciation dictionary.
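To make the resource argument concrete, the following minimal sketch contrasts the two kinds of lexicon. It is an illustration under stated assumptions, not the authors' implementation: the helper `grapheme_lexicon`, the sample vocabulary, and the tiny `PHONEME_LEXICON` table are all hypothetical, and a grapheme is simplified here to a single lowercase character (multi-character graphemes are ignored).

```python
# Hypothetical illustration: a grapheme lexicon can be derived from the
# spelling of each word, so no hand-built pronunciation dictionary is needed.

def grapheme_lexicon(words):
    """Map each word to its grapheme sequence (simplified: one lowercase
    character per grapheme)."""
    return {word: list(word.lower()) for word in words}


# A phoneme lexicon, by contrast, needs an externally supplied entry for
# every word. These two entries use ARPAbet-style symbols and are
# illustrative only.
PHONEME_LEXICON = {
    "check": ["CH", "EH", "K"],
    "balance": ["B", "AE", "L", "AH", "N", "S"],
}

if __name__ == "__main__":
    vocab = ["check", "balance", "saldo"]  # assumed sample vocabulary
    graphemes = grapheme_lexicon(vocab)
    for word in vocab:
        # .get returns None when the dictionary has no entry for the word
        phones = PHONEME_LEXICON.get(word)
        print(word, graphemes[word], phones)
```

In this sketch every word receives a grapheme entry automatically, while any word missing from the pronunciation table has no phoneme entry until someone supplies one. That gap is the per-language resource cost the abstract refers to.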
