Journal: IEEE Transactions on Audio, Speech, and Language Processing

Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings

Abstract

We present a new framework for joint analysis of throat and acoustic microphone (TAM) recordings to improve throat-microphone-only speech recognition. The proposed framework learns joint sub-phone patterns of throat and acoustic microphone recordings through a parallel-branch HMM structure. The joint sub-phone patterns define temporally correlated neighborhoods, in which a linear prediction filter estimates a spectrally rich acoustic feature vector from throat feature vectors. Multimodal speech recognition with throat and throat-driven acoustic features significantly improves throat-only speech recognition performance. Experimental evaluations on a parallel TAM database yield benchmark phoneme recognition rates of 46.81% and 60.69% for the throat-only and multimodal TAM speech recognition systems, respectively. The proposed throat-driven multimodal speech recognition system raises the phoneme recognition rate to 52.58%, a significant relative improvement over the throat-only benchmark system.
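
The abstract reports three phoneme recognition rates but does not quantify the "relative improvement". The short sketch below works out the arithmetic from those three figures only. The function names are hypothetical, and the "gap closed" measure is an illustrative assumption added here, not a metric taken from the paper.

```python
# Phoneme recognition rates (%) quoted in the abstract.
THROAT_ONLY = 46.81      # throat-only benchmark
THROAT_DRIVEN = 52.58    # proposed throat-driven multimodal system
MULTIMODAL_TAM = 60.69   # multimodal TAM benchmark (real acoustic channel)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


def gap_closed(lower: float, achieved: float, upper: float) -> float:
    """Share (in percent) of the lower-to-upper gap covered by `achieved`.

    Illustrative measure only; it is not defined in the abstract.
    """
    return (achieved - lower) / (upper - lower) * 100.0


if __name__ == "__main__":
    absolute = THROAT_DRIVEN - THROAT_ONLY
    print(f"absolute gain: {absolute:.2f} points")
    print(f"relative gain: {relative_improvement(THROAT_ONLY, THROAT_DRIVEN):.2f}%")
    print(f"gap closed:    {gap_closed(THROAT_ONLY, THROAT_DRIVEN, MULTIMODAL_TAM):.2f}%")
```

Under these figures the throat-driven system gains 5.77 points absolute, about 12.3% relative to the throat-only benchmark, and covers roughly 41.6% of the distance to the multimodal TAM benchmark.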
