ETRI Journal

Fast speaker adaptation using extended diagonal linear transformation for deep neural networks


Abstract

This paper explores new techniques based on a hidden-layer linear transformation for fast speaker adaptation in deep neural networks (DNNs). Conventional methods using affine transformations are impractical because they require a relatively large number of adaptation parameters. Methods employing singular-value decomposition (SVD) are used instead because they effectively reduce the number of adaptive parameters; however, matrix decomposition is computationally expensive for online services. We propose an extended diagonal linear transformation method that minimizes the number of adaptation parameters without SVD and improves performance on tasks that require smaller degrees of adaptation. On Korean large-vocabulary continuous speech recognition (LVCSR) tasks, the proposed method shows significant improvements, with error-reduction rates of 8.4% and 17.1% when adapting on five and 50 conversational sentences, respectively. Compared with SVD-based adaptation methods, it achieves higher recognition performance with fewer parameters.
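The parameter savings the abstract describes can be illustrated with a minimal sketch. The snippet below (an assumption for illustration, not the paper's actual implementation; the paper's "extended" variant adds further structure beyond a pure diagonal) contrasts a full affine adaptation layer, W·h + b, with a diagonal linear transformation, d ⊙ h + b, applied to a hidden-layer activation:

```python
import numpy as np

hidden_dim = 512  # illustrative hidden-layer width

# Full affine adaptation layer (W @ h + b): hidden_dim**2 + hidden_dim parameters.
affine_params = hidden_dim ** 2 + hidden_dim

# Diagonal linear transformation (d * h + b): only 2 * hidden_dim parameters.
d = np.ones(hidden_dim)    # per-unit scale, initialized to the identity transform
b = np.zeros(hidden_dim)   # per-unit bias, initialized to zero

def diagonal_adapt(h):
    """Apply the speaker-adaptation transform to a hidden activation vector."""
    return d * h + b

# With identity initialization, activations pass through unchanged,
# so adaptation starts from the speaker-independent model.
h = np.random.default_rng(0).standard_normal(hidden_dim)
assert np.allclose(diagonal_adapt(h), h)

diag_params = 2 * hidden_dim
print(affine_params, diag_params)  # 262656 vs 1024 adaptation parameters
```

With so few parameters per speaker, the diagonal transform can be estimated from only a handful of adaptation sentences without the matrix decomposition that SVD-based methods require, which is the trade-off the abstract highlights.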
