Journal: Computer Speech and Language

Sparse coding over redundant dictionaries for fast adaptation of speech recognition system


Abstract

This work presents a novel use of sparse coding over a redundant dictionary for fast adaptation of the acoustic models in hidden Markov model-based automatic speech recognition (ASR) systems. It extends the existing fast adaptation approaches based on acoustic model interpolation. In those methods, the basis (model) weights are estimated by an iterative procedure under the maximum-likelihood (ML) criterion. Effective adaptation typically requires selecting a number of bases, and as a result the latency of the iterative weight estimation becomes high for ASR tasks that involve human-machine interaction. To address this issue, we propose sparse coding of the target mean supervector over a speaker-specific (exemplar) redundant dictionary. In this approach, the greedy sparse coding not only selects the desired bases but also compresses them into a single supervector, which is then ML scaled to yield the adapted mean parameters. This reduces the latency of basis weight estimation in comparison with the existing fast adaptation techniques. Further, to address the loss of information caused by the reduced degrees of freedom, we extend the proposed approach with separate sparse codings over multiple (exemplar and learned) redundant dictionaries. In adapting an ASR task involving human-computer interaction, the proposed approach is found to be as effective as the existing techniques, with a substantial reduction in computational cost.
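
The pipeline the abstract describes (greedy selection of bases from a redundant dictionary, compression into one supervector, then a single scaling step) can be illustrated with a toy sketch. The abstract does not name the greedy algorithm or give the ML scaling formula, so everything here is an assumption made for illustration: plain matching pursuit serves as the greedy coder, a least-squares scalar stands in for the ML scale, and the names `matching_pursuit`, `compress`, `scale`, the tiny dictionary, and the target vector are all hypothetical. Atoms are assumed to be non-zero.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(target, atoms, n_nonzero):
    """Greedy sparse coding (assumed here to be plain matching pursuit).

    At each step, pick the unit-normalised atom most correlated with the
    current residual, record its coefficient, and subtract its contribution.
    """
    norms = [math.sqrt(dot(a, a)) for a in atoms]  # atoms assumed non-zero
    unit = [[x / n for x in a] for a, n in zip(atoms, norms)]
    residual = list(target)
    coeffs = {}
    for _ in range(n_nonzero):
        corr = [dot(residual, u) for u in unit]
        k = max(range(len(unit)), key=lambda i: abs(corr[i]))
        if abs(corr[k]) < 1e-12:  # nothing left to explain
            break
        coeffs[k] = coeffs.get(k, 0.0) + corr[k]
        residual = [r - corr[k] * x for r, x in zip(residual, unit[k])]
    return coeffs, unit


def compress(coeffs, unit, dim):
    """Collapse the selected atoms into a single supervector."""
    combined = [0.0] * dim
    for k, c in coeffs.items():
        for d in range(dim):
            combined[d] += c * unit[k][d]
    return combined


def scale(combined, target):
    """Least-squares scalar; a stand-in (assumption) for the ML scaling."""
    denom = dot(combined, combined)
    return dot(combined, target) / denom if denom > 0 else 0.0


# Toy data (hypothetical): 4 atoms in 3 dimensions, more atoms than dimensions,
# which is what makes the dictionary redundant.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
target = [3.0, 0.0, 4.0]

coeffs, unit = matching_pursuit(target, atoms, n_nonzero=2)
combined = compress(coeffs, unit, dim=len(target))
alpha = scale(combined, target)
adapted = [alpha * x for x in combined]
```

The point of the sketch is the cost structure: once the bases are selected and combined, only one scalar remains to be estimated, in contrast to an iterative estimate of one weight per basis. That is also the reduction in degrees of freedom the abstract mentions as the motivation for coding over multiple dictionaries.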