Efficient computation of the frame-based extended union model and its application in speech recognition against partial temporal corruptions

Abstract

The extended union model (EUM) was recently proposed and shown to be effective in handling short-time temporal corruption. Because of its computational complexity, the EUM probability can only be computed over groups of consecutive observations (called segments), and recognition can only be performed under an N-best re-scoring paradigm. In this paper, we introduce a hidden variable called the “pattern of corruption” and re-formulate the extended union model as a marginalization over possible patterns of corruption, with the likelihood computed via missing feature theory. We then introduce a recursive relationship between the EUM probabilities of two successive observation sequences that greatly simplifies the EUM probability computation. This makes it possible to compute the EUM probability over a long sequence. Using this recursive relationship, the EUM probability over frames, called the “frame-based EUM”, can easily be computed. To simplify EUM-based recognition, we propose an approximate, dynamic programming-based EUM recognition algorithm, called the Frame-based EUM Viterbi algorithm (FEVA), that performs recognition directly instead of via N-best re-scoring. Experimental results on digit recognition under added impulsive noise show that both the frame-based EUM and the FEVA outperform the segment-based EUM.
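
The idea of marginalizing over corruption patterns with a recursion can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes that per-frame likelihoods are independent and that a corrupted frame is marginalized out, so it contributes a factor of 1. Under those assumptions, the sum over all patterns with exactly m corrupted frames can be updated one frame at a time instead of enumerating every pattern. The function names and the toy likelihood values are hypothetical.

```python
from itertools import combinations
from math import prod


def pattern_sum_bruteforce(frame_likes, m):
    """Enumerate every pattern with exactly m corrupted frames.

    Assumption (not from the paper): a corrupted frame is marginalized
    out and contributes a factor of 1; clean frames contribute their
    likelihood.
    """
    total = 0.0
    for corrupted in combinations(range(len(frame_likes)), m):
        skip = set(corrupted)
        total += prod(p for t, p in enumerate(frame_likes) if t not in skip)
    return total


def pattern_sum_recursive(frame_likes, m):
    """Same quantity, computed frame by frame.

    table[k] holds the sum over patterns with exactly k corrupted frames
    among the frames processed so far. Adding one frame with likelihood p:
        new[k] = old[k] * p   (new frame is clean)
               + old[k - 1]   (new frame is corrupted)
    """
    table = [1.0] + [0.0] * m
    for p in frame_likes:
        # Update from high k to low k so table[k - 1] is still the old value.
        for k in range(m, 0, -1):
            table[k] = table[k] * p + table[k - 1]
        table[0] *= p
    return table[m]


if __name__ == "__main__":
    likes = [0.5, 0.2, 0.1, 0.4]  # toy per-frame likelihoods (made up)
    for m in range(3):
        print(m, pattern_sum_bruteforce(likes, m), pattern_sum_recursive(likes, m))
```

The brute-force version visits every subset of m frames, while the recursive version does O(T·m) work for T frames, which is the kind of saving that makes a frame-level computation over long sequences practical.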

Bibliographic details

  • Source
    Computer Speech and Language | 2005, Issue 3 | pp. 301-319 | 19 pages
  • Authors

    Arthur Chan; Manhung Siu

  • Affiliation

    Department of EEE, The Hong Kong University of Science and Technology, Clearwater Bay, Hong Kong (both authors)

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: computing technology, computer technology