Journal: IEEE Transactions on Audio, Speech, and Language Processing

Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems


Abstract

Model adaptation techniques are an efficient way to reduce the mismatch that typically occurs between the training and test conditions of any automatic speech recognition (ASR) system. This work addresses the increased performance degradation observed when moving from speaker-dependent (SD) to speaker-independent (SI) conditions in connectionist (or hybrid) hidden Markov model/artificial neural network (HMM/ANN) systems in the context of large vocabulary continuous speech recognition (LVCSR). Adapting hybrid HMM/ANN systems with a small amount of adaptation data has proven to be a difficult task and has been a limiting factor in the widespread deployment of hybrid techniques in operational ASR systems. Addressing the crucial issue of speaker adaptation (SA) for hybrid HMM/ANN systems can therefore have a great impact on the connectionist paradigm, which will play a major role in the design of next-generation LVCSR systems, given the great success reported on many speech tasks by deep neural networks (ANNs with many hidden layers that adopt a pre-training technique). Current adaptation techniques for ANNs, which inject an adaptable linear transformation network connected to either the input or the output layer, are not effective, especially with a small amount of adaptation data, e.g., a single adaptation utterance. In this paper, a novel solution is proposed to overcome those limits and make adaptation robust to scarce adaptation resources. The key idea is to adapt the hidden activation functions rather than the network weights. The adoption of Hermitian activation functions makes this possible. Experimental results on an LVCSR task demonstrate the effectiveness of the proposed approach.
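The abstract does not give the exact form of the Hermitian activation function, so the following is only a minimal sketch of the general idea of adapting an activation function while the network weights stay frozen. It assumes the activation is a weighted sum of orthonormal Hermite functions and that adaptation is plain gradient descent on a squared error. The names `hermite_functions`, `hermite_activation`, `mse`, and `adapt_coeffs`, the learning rate, and the step count are all illustrative assumptions, not taken from the paper.

```python
import math


def hermite_functions(x, degree):
    """Return [phi_0(x), ..., phi_degree(x)], the orthonormal Hermite functions.

    Uses the standard three-term recurrence:
      phi_0(x)     = pi^(-1/4) * exp(-x^2 / 2)
      phi_{n+1}(x) = sqrt(2/(n+1)) * x * phi_n(x) - sqrt(n/(n+1)) * phi_{n-1}(x)
    """
    phis = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if degree >= 1:
        phis.append(math.sqrt(2.0) * x * phis[0])
    for n in range(1, degree):
        phis.append(
            math.sqrt(2.0 / (n + 1)) * x * phis[n]
            - math.sqrt(n / (n + 1)) * phis[n - 1]
        )
    return phis


def hermite_activation(z, coeffs):
    """Assumed activation form: sum_k coeffs[k] * phi_k(z)."""
    phis = hermite_functions(z, len(coeffs) - 1)
    return sum(c * p for c, p in zip(coeffs, phis))


def mse(coeffs, pre_activations, targets):
    """Mean squared error of the activation outputs against the targets."""
    errs = [hermite_activation(z, coeffs) - t
            for z, t in zip(pre_activations, targets)]
    return sum(e * e for e in errs) / len(errs)


def adapt_coeffs(coeffs, pre_activations, targets, lr=0.1, steps=200):
    """Gradient descent on the activation coefficients only.

    The pre-activations z = w * x + b are treated as fixed inputs, which
    stands in for keeping the network weights frozen (an assumption of
    this sketch). The constant factor 2 of the MSE gradient is absorbed
    into the learning rate.
    """
    coeffs = list(coeffs)
    n = len(targets)
    for _ in range(steps):
        grads = [0.0] * len(coeffs)
        for z, t in zip(pre_activations, targets):
            phis = hermite_functions(z, len(coeffs) - 1)
            err = sum(c * p for c, p in zip(coeffs, phis)) - t
            for k, p in enumerate(phis):
                grads[k] += err * p
        coeffs = [c - lr * g / n for c, g in zip(coeffs, grads)]
    return coeffs
```

In this sketch each hidden unit carries only a handful of coefficients, so the number of speaker-specific parameters is small compared with a full weight matrix. That is one plausible reading of why adapting the activation could suit scarce adaptation data, though the abstract itself does not state the parameter count.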
