Source: MATEC Web of Conferences

End-to-End Mandarin Recognition based on Convolution Input


Abstract

Mainstream neural network training uses a cross-entropy criterion that classifies and optimizes each frame of acoustic data, whereas continuous speech recognition is evaluated by sequence-level transcription accuracy. To address this mismatch, this paper constructs an end-to-end speech recognition system trained on sequence-level transcription. The model processes the input features with a convolutional neural network, selects the best-performing network structure, and performs two-dimensional convolution over the time and frequency domains. The network also uses batch normalization to reduce generalization error and speed up training. Finally, the hyper-parameters of the decoding process are optimized to improve modelling performance. Experimental results show that the system performs substantially better than mainstream speech recognition systems.
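As a rough illustration of the input stage described in the abstract, the sketch below applies a two-dimensional convolution over a time-by-frequency feature matrix and then batch-normalizes the resulting feature maps. It is not the paper's implementation: the abstract gives no kernel sizes, layer counts, or framework, so the function names (`conv2d_valid`, `batch_norm`), the toy inputs, and the 2x3 kernel are all assumptions chosen for readability, and the normalization omits the learned scale and shift that a trainable layer would normally include.

```python
from typing import List

# Rows are time frames, columns are frequency bins (an assumed layout).
Matrix = List[List[float]]


def conv2d_valid(features: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` jointly over time (rows) and frequency (columns).

    No padding and stride 1, computed as cross-correlation, which is what
    most deep learning toolkits call "convolution".
    """
    t_len, f_len = len(features), len(features[0])
    k_t, k_f = len(kernel), len(kernel[0])
    out: Matrix = []
    for t in range(t_len - k_t + 1):
        row = []
        for f in range(f_len - k_f + 1):
            acc = 0.0
            for i in range(k_t):
                for j in range(k_f):
                    acc += features[t + i][f + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def batch_norm(batch: List[Matrix], eps: float = 1e-5) -> List[Matrix]:
    """Normalize one channel to zero mean and unit variance.

    Statistics are pooled over every example, time step, and frequency bin
    in the mini-batch. Learned scale/shift parameters are left out.
    """
    values = [v for fmap in batch for row in fmap for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = (var + eps) ** -0.5
    return [[[(v - mean) * scale for v in row] for row in fmap] for fmap in batch]


# Two toy "spectrograms": 4 time frames x 5 frequency bins each.
spec_a = [[float(t * f) for f in range(5)] for t in range(4)]
spec_b = [[float(t + f) for f in range(5)] for t in range(4)]

# Hypothetical 2x3 kernel: spans 2 frames in time and 3 bins in frequency.
kernel = [[1.0, 0.0, -1.0],
          [1.0, 0.0, -1.0]]

conv_batch = [conv2d_valid(spec, kernel) for spec in (spec_a, spec_b)]
normed_batch = batch_norm(conv_batch)
```

Because the kernel spans both axes, each output value mixes information from neighbouring frames and neighbouring frequency bins, which is the point of convolving in two dimensions rather than along time alone. Each 4x5 input shrinks to a 3x3 feature map here because no padding is used.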
