
Lexicon-Free Conversational Speech Recognition with Neural Networks



Abstract

We present an approach to speech recognition that uses only a neural network to map acoustic input to characters, a character-level language model, and a beam search decoding procedure. This approach eliminates much of the complex infrastructure of modern speech recognition systems, making it possible to directly train a speech recognizer using errors generated by spoken language understanding tasks. The system naturally handles out-of-vocabulary words and spoken word fragments. We demonstrate our approach on the challenging Switchboard telephone conversation transcription task, achieving a word error rate competitive with existing baseline systems. To our knowledge, this is the first entirely neural-network-based system to achieve strong speech transcription results on a conversational speech task. We analyze qualitative differences between transcriptions produced by our lexicon-free approach and transcriptions produced by a standard speech recognition system. Finally, we evaluate the impact of large-context neural network character language models as compared to standard n-gram models within our framework.
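The three components named in the abstract (per-frame character probabilities from a neural network, a character-level language model, and beam search) can be illustrated with a minimal sketch. The abstract does not state the network's output convention, so the code below assumes a CTC-style output in which index 0 of each frame is a "blank" symbol and repeated characters collapse. The function name `beam_search_decode`, the `char_lm(prefix, ch)` callable, and the `lm_weight` parameter are hypothetical names chosen for illustration, not the authors' implementation.

```python
from collections import defaultdict

def beam_search_decode(frames, alphabet, char_lm=None, lm_weight=1.0, beam_width=8):
    """Prefix beam search over per-frame character probabilities.

    frames:   list of probability lists; index 0 is the assumed blank symbol,
              index i >= 1 corresponds to alphabet[i - 1].
    char_lm:  optional callable (prefix, ch) -> probability of ch given prefix
              (a stand-in for a character-level language model).
    """
    # Each prefix tracks (prob. of paths ending in blank, prob. of paths ending in a character).
    beams = {"": (1.0, 0.0)}
    for frame in frames:
        candidates = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_char) in beams.items():
            total = p_blank + p_char
            # Emitting blank keeps the prefix unchanged.
            candidates[prefix][0] += frame[0] * total
            for idx, ch in enumerate(alphabet, start=1):
                # The language model is applied only when a character extends the prefix.
                lm = char_lm(prefix, ch) ** lm_weight if char_lm else 1.0
                if prefix.endswith(ch):
                    # A repeat with no blank in between collapses into the same prefix.
                    candidates[prefix][1] += frame[idx] * p_char
                    # A true double letter requires an intervening blank.
                    candidates[prefix + ch][1] += frame[idx] * lm * p_blank
                else:
                    candidates[prefix + ch][1] += frame[idx] * lm * total
        # Keep only the most probable prefixes.
        ranked = sorted(candidates.items(), key=lambda kv: kv[1][0] + kv[1][1], reverse=True)
        beams = {prefix: (pb, pc) for prefix, (pb, pc) in ranked[:beam_width]}
    return max(beams, key=lambda p: sum(beams[p]))

# Toy input: three frames over the alphabet "ab" (columns: blank, 'a', 'b').
toy_frames = [[0.05, 0.9, 0.05], [0.9, 0.05, 0.05], [0.05, 0.05, 0.9]]
print(beam_search_decode(toy_frames, "ab"))  # prints "ab"
```

Because hypotheses are built character by character, with no word lexicon constraining them, a decoder of this shape can emit out-of-vocabulary words and word fragments. A practical version would work in log space to avoid underflow on long utterances; this sketch uses plain probabilities for readability.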
