Conference: International Conference on Pattern Recognition and Image Analysis

A Convolutional Neural Network model based on Neutrosophy for Noisy Speech Recognition



Abstract

Convolutional neural networks (CNNs) are sensitive to unknown noise conditions in the test phase, so their performance degrades on noisy data classification tasks, including noisy speech recognition. This research proposes a new CNN model with data uncertainty handling, referred to as NCNN (Neutrosophic Convolutional Neural Network), for classification. Speech signals are used as the input data, and their noise is modeled as uncertainty. Using the speech spectrogram, a definition of uncertainty is proposed in the neutrosophic (NS) domain. Uncertainty is computed for each time-frequency point of the spectrogram, treating each point like an image pixel, which yields an uncertainty matrix of the same size as the spectrogram in the NS domain. Next, a CNN classification model with two parallel paths is proposed: the speech spectrogram is the input to the first path, and the uncertainty matrix is the input to the second. The outputs of the two paths are combined to compute the final output of the classifier. To show its effectiveness, the proposed method is compared with a conventional CNN on the isolated words of the Aurora2 dataset. The proposed method achieves an average accuracy of 85.96% with noisy training data, with accuracies of 90%, 88%, and 81% on test sets A, B, and C, respectively. It outperforms the conventional CNN by 6, 5, and 2 percentage points on test sets A, B, and C, respectively, which indicates that the proposed method is more robust to noisy data and handles such data effectively.
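
The abstract does not give the formula for the uncertainty matrix or say how the two paths are combined, so the following is only a rough sketch under stated assumptions. It borrows an indeterminacy measure often used in neutrosophic image processing (the normalized absolute difference between a point and its local mean) as a stand-in for the paper's definition, and it replaces each CNN path with a single linear map whose scores are summed. The names `indeterminacy_map`, `two_path_scores`, the 3 x 3 window, and the weight matrices are all hypothetical.

```python
import numpy as np

def local_mean(x, k=3):
    """Mean over a k x k window; edges use reflection padding."""
    pad = k // 2
    xp = np.pad(x, pad, mode="reflect")
    out = np.zeros(x.shape, dtype=float)
    for di in range(k):
        for dj in range(k):
            out += xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out / (k * k)

def min_max(x, eps=1e-12):
    """Scale values into [0, 1]; eps avoids division by zero."""
    return (x - x.min()) / (x.max() - x.min() + eps)

def indeterminacy_map(spec, k=3):
    """ASSUMED definition, not taken from the paper: uncertainty of each
    time-frequency point is its normalized deviation from the local mean.
    The result has the same shape as the spectrogram."""
    delta = np.abs(spec - local_mean(spec, k))
    return min_max(delta)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def two_path_scores(spec, unc, w_spec, w_unc):
    """Toy stand-in for the two parallel paths: each path is a linear map
    (not a real CNN), and the paths are fused by summing their scores.
    The fusion rule is an assumption; the abstract only says the outputs
    are combined."""
    logits = spec.ravel() @ w_spec + unc.ravel() @ w_unc
    return softmax(logits)
```

For a spectrogram `spec` of shape `(F, T)`, `indeterminacy_map(spec)` returns an `(F, T)` matrix with values in `[0, 1]`, and `two_path_scores` expects weight matrices of shape `(F * T, n_classes)`. In the model the abstract describes, each linear map would be replaced by a convolutional path.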
