Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

End-to-end speech recognition for languages with ideographic characters


Abstract

This paper describes a novel training method for acoustic models that use connectionist temporal classification (CTC) for Japanese end-to-end automatic speech recognition (ASR). End-to-end ASR can estimate characters directly without a pronunciation dictionary; however, this approach has been studied mostly for English. For languages such as Japanese, robust acoustic modeling is difficult. One issue is the large number of characters, including Japanese kanji, which increases the number of model parameters. Additionally, because many kanji have multiple pronunciations, the variance of the acoustic features for the corresponding characters increases. We therefore propose end-to-end ASR based on bi-directional long short-term memory (BLSTM) networks to solve these problems. Our proposal involves two approaches: reducing the number of dimensions of the BLSTM and adding character strings to the output layer labels. Dimensional compression decreases the number of parameters, while output label expansion reduces the variance of the acoustic features. Consequently, we obtain a robust model with a small number of parameters. Our experimental results on Japanese broadcast programs show that combining these two approaches improved the word error rate significantly compared with the conventional character-based end-to-end approach.
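
To make the idea of output label expansion more concrete, the following is a minimal sketch of greedy CTC decoding in which the label inventory contains a multi-character string alongside single characters. It is not the authors' implementation: the function name, the blank symbol, the label choices, and the per-frame sequence are all hypothetical and chosen only for illustration. The example uses the kanji string 今日, which is read "kyou" as a unit, whereas 今 and 日 have different readings on their own, so treating the string as one label would plausibly give it more consistent acoustics.

```python
# Hypothetical blank symbol; CTC reserves one label that emits no output.
BLANK = "<blank>"


def ctc_greedy_collapse(frame_labels):
    """Collapse a per-frame best-label sequence the standard CTC way:
    merge consecutive repeats, then drop blanks.

    Labels may be single characters or multi-character strings
    (an assumed form of the 'expanded' output labels).
    """
    output = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != BLANK:
            output.append(label)
        previous = label
    return "".join(output)


# Illustrative per-frame labels (not real model output).
# "今日" is a single expanded label covering two characters.
frames = [BLANK, "今日", "今日", BLANK, "は", "は", BLANK, "晴", "れ", BLANK]
decoded = ctc_greedy_collapse(frames)
print(decoded)  # 今日は晴れ
```

In this sketch the decoder itself does not change when labels are expanded; only the label inventory grows. How the paper selects which character strings to add is not stated in the abstract.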
