Conference on Empirical Methods in Natural Language Processing

Introspection for convolutional automatic speech recognition


Abstract

Artificial Neural Networks (ANNs) have experienced great success in the past few years. The increasing complexity of these models leads to less understanding about their decision processes. Therefore, introspection techniques have been proposed, mostly for images as input data. Patterns or relevant regions in images can be intuitively interpreted by a human observer. This is not the case for more complex data like speech recordings. In this work, we investigate the application of common introspection techniques from computer vision to an Automatic Speech Recognition (ASR) task. To this end, we use a model similar to image classification, which predicts letters from spectrograms. We show difficulties in applying image introspection to ASR. To tackle these problems, we propose normalized averaging of aligned inputs (NAvAI): a data-driven method to reveal learned patterns for prediction of specific classes. Our method integrates information from many data examples through local introspection techniques for Convolutional Neural Networks (CNNs). We demonstrate that our method provides better interpretability of letter-specific patterns than existing methods.
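
The abstract does not spell out how NAvAI is computed, so the following is only a rough sketch of what a "normalized averaging of aligned inputs" could look like, not the authors' implementation. It assumes that spectrogram excerpts have already been aligned so that the predicted letter falls on the same frame, and that a relevance map for each excerpt is available from some local introspection technique (for example, absolute input gradients). The function name `normalized_average`, its parameters, and the specific normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def normalized_average(windows, relevances, eps=1e-8):
    """Relevance-weighted average of aligned spectrogram excerpts.

    Hypothetical sketch; the normalization scheme is an assumption.

    windows:    array (n_examples, n_freq, n_frames) of spectrogram excerpts,
                assumed to be aligned on the frame of the predicted letter.
    relevances: array of the same shape, e.g. absolute input gradients from
                any local introspection method (not computed here).
    """
    windows = np.asarray(windows, dtype=float)
    relevances = np.abs(np.asarray(relevances, dtype=float))

    # Scale each relevance map to sum to 1 so no single example dominates.
    totals = relevances.sum(axis=(1, 2), keepdims=True)
    weights = relevances / (totals + eps)

    # Relevance-weighted mean over examples, per time-frequency bin.
    weighted_sum = (weights * windows).sum(axis=0)
    norm = weights.sum(axis=0)
    return weighted_sum / (norm + eps)
```

Under these assumptions, the result is a single time-frequency pattern per letter class: bins that are consistently relevant across many examples stand out, while example-specific noise averages away.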
