Source journal: IEEE Transactions on Audio, Speech and Language Processing

Exploiting Temporal Correlation of Speech for Error Robust and Bandwidth Flexible Distributed Speech Recognition

Abstract

In this paper, the temporal correlation of speech is exploited in front-end feature extraction, client-based error recovery, and server-based error concealment (EC) for distributed speech recognition. First, the paper investigates a half frame rate (HFR) front-end that uses double frame shifting at the client side. At the server side, each HFR feature vector is duplicated to construct a full frame rate (FFR) feature sequence. This HFR front-end gives performance comparable to the FFR front-end but contains only half of the FFR features. Second, different arrangements of the other half of the FFR features create a set of error recovery techniques encompassing multiple description coding and interleaving schemes, where interleaving has the advantage of not introducing a delay when there are no transmission errors. Third, a subvector-based EC technique is presented in which error detection and concealment are conducted at the subvector level, as opposed to conventional techniques in which an entire vector is replaced even if only a single bit error occurs. The subvector EC is further combined with weighted Viterbi decoding. Encouraging recognition results are observed for the proposed techniques. Lastly, to understand the effects of applying various EC techniques, this paper introduces three approaches consisting of speech feature, dynamic programming distance, and hidden Markov model state duration comparison.
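
The abstract describes two mechanisms that a small sketch may make more concrete: rebuilding a full-rate sequence by duplicating half-rate vectors, and concealing errors per subvector rather than per vector. The Python below is only an illustration under stated assumptions, not the paper's implementation. The function names, the subvector size of 2, and the concealment rule (copy the same subvector from the nearest unflagged frame) are all hypothetical. The doubled frame shift is approximated by keeping every second full-rate frame.

```python
from typing import List, Set, Tuple

Vector = List[float]


def hfr_encode(ffr_frames: List[Vector]) -> List[Vector]:
    """Client side (assumed): keep every second frame to mimic a doubled frame shift."""
    return [list(v) for v in ffr_frames[::2]]


def hfr_decode(hfr_frames: List[Vector], n_frames: int) -> List[Vector]:
    """Server side: duplicate each HFR vector to rebuild a full-rate sequence."""
    rebuilt: List[Vector] = []
    for vec in hfr_frames:
        rebuilt.append(list(vec))
        rebuilt.append(list(vec))
    # Trim in case the original sequence had an odd number of frames.
    return rebuilt[:n_frames]


def conceal_subvectors(
    frames: List[Vector],
    bad: Set[Tuple[int, int]],
    sub_size: int = 2,
) -> List[Vector]:
    """Repair only the flagged (frame, subvector) pairs.

    Assumed rule: copy the same subvector from the nearest frame in which
    that subvector is not flagged, preferring the earlier frame on a tie.
    A flagged subvector with no clean donor is left unchanged.
    """
    out = [list(v) for v in frames]
    n = len(frames)
    for t, s in sorted(bad):
        lo, hi = s * sub_size, (s + 1) * sub_size
        # Candidate donor frames ordered by distance from t, earlier first on ties.
        candidates = sorted(range(n), key=lambda u: (abs(u - t), u))
        for u in candidates:
            if (u, s) not in bad:
                out[t][lo:hi] = frames[u][lo:hi]
                break
    return out


if __name__ == "__main__":
    ffr = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]]
    print(hfr_decode(hfr_encode(ffr), len(ffr)))
    # Only subvector 1 of frame 1 is flagged; subvector 0 of that frame is kept.
    print(conceal_subvectors(ffr, {(1, 1)}))
```

In this sketch a single flagged subvector leaves the rest of its frame untouched, which is the contrast the abstract draws with whole-vector replacement.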
