Recurrent neural networks (RNNs) and the connectionist temporal classification (CTC) algorithm are applied to acoustic modeling for Tibetan speech recognition, enabling end-to-end model training. Based on the relationship between the input and output of the acoustic model, a time-domain convolution over the output sequence of the hidden layer is introduced to reduce the number of time steps over which the network's hidden layers are unrolled, which effectively improves training and decoding efficiency. Experimental results show that the RNN model achieves better recognition performance on the Lhasa Tibetan phoneme recognition task than traditional acoustic models based on hidden Markov models (HMMs), while the RNN acoustic model with time-domain convolution trains and decodes more efficiently at the same level of recognition performance.
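The abstract does not give the exact form of the time-domain convolution, so the following is only a minimal sketch of the general idea: sliding a window along the time axis of a hidden layer's output sequence with a stride greater than 1, so that the next layer is unrolled over fewer time steps. The function name `temporal_conv`, the scalar per-offset kernel, the window width, and the stride are all illustrative assumptions, not the authors' configuration; a trained model would normally learn a weight matrix for each time offset.

```python
from typing import List, Sequence

Vector = List[float]


def temporal_conv(
    hidden: Sequence[Vector],
    kernel: Sequence[float],
    stride: int,
) -> List[Vector]:
    """Convolve a sequence of hidden-layer outputs along the time axis.

    hidden: T vectors of equal dimension, one per input frame.
    kernel: one scalar weight per time offset (hypothetical simplification;
            a real model would learn a weight matrix per offset).
    stride: hop between consecutive windows; stride > 1 shortens the sequence.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    width = len(kernel)
    dim = len(hidden[0]) if hidden else 0
    out: List[Vector] = []
    # Slide a window of `width` frames, moving `stride` frames each step.
    for start in range(0, len(hidden) - width + 1, stride):
        frame = [0.0] * dim
        for offset, weight in enumerate(kernel):
            for d in range(dim):
                frame[d] += weight * hidden[start + offset][d]
        out.append(frame)
    return out


# Toy input: 6 frames of 2-dimensional hidden outputs (made-up values).
hidden_seq = [[float(t), float(2 * t)] for t in range(6)]
reduced = temporal_conv(hidden_seq, kernel=[0.5, 0.5], stride=2)
print(len(hidden_seq), "->", len(reduced))  # prints "6 -> 3"
```

With a window of 2 frames and a stride of 2, the six-frame toy sequence is reduced to three frames, so any recurrent layer stacked on top would run over half as many time steps. For CTC training, the shortened sequence must still be at least as long as the target label sequence.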