
Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation



Abstract

Discriminative criteria have been widely used for training acoustic models for automatic speech recognition (ASR). Many discriminative criteria have been proposed including maximum mutual information (MMI), minimum phone error (MPE), and boosted MMI (BMMI). Discriminative training is known to provide significant performance gains over conventional maximum-likelihood (ML) training. However, as discriminative criteria aim at direct minimization of the classification error, they strongly rely on having accurate reference labels. Errors in the reference labels directly affect the performance. Recently, the differenced MMI (dMMI) criterion has been proposed for generalizing conventional criteria such as BMMI and MPE. dMMI can approach BMMI or MPE if its hyper-parameters are properly set. Moreover, dMMI introduces intermediate criteria that can be interpreted as smoothed versions of BMMI or MPE. These smoothed criteria are robust to errors in the reference labels. In this paper, we demonstrate the effect of dMMI on unsupervised speaker adaptation where the reference labels are estimated from a first recognition pass and thus inevitably contain errors. In particular, we introduce dMMI-based linear regression (dMMI-LR) adaptation and demonstrate significant gains in performance compared with MLLR and BMMI-LR in two large vocabulary lecture recognition tasks.
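For orientation, a minimal sketch of the quantities discussed above, using illustrative symbols that are not taken from the article itself: writing \(F_{\mathrm{BMMI}}(\Lambda;\sigma)\) for a boosted-MMI objective over model parameters \(\Lambda\) with boosting factor \(\sigma\), the differenced criterion is, up to notational details, a finite difference of two such objectives,

\[ F_{\mathrm{dMMI}}(\Lambda;\sigma_1,\sigma_2) \;=\; \frac{1}{\sigma_2-\sigma_1}\,\bigl(F_{\mathrm{BMMI}}(\Lambda;\sigma_2)-F_{\mathrm{BMMI}}(\Lambda;\sigma_1)\bigr). \]

Under this reading, particular settings of \((\sigma_1,\sigma_2)\) recover a (B)MMI-like objective, letting \(\sigma_1\to\sigma_2\) approaches an error-weighted, MPE-like criterion, and intermediate values give the smoothed variants referred to above. Linear-regression adaptation of the Gaussian means has the familiar form \(\hat{\mu}=A\mu+b\); in dMMI-LR the transform \((A,b)\) would be estimated on the first-pass hypotheses by optimizing the dMMI criterion instead of the ML (MLLR) or BMMI (BMMI-LR) objectives. These expressions are a reconstruction from the description above, not equations quoted from the article.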

Bibliographic details

  • Source
    Computer Speech and Language | 2016, Issue 3 | pp. 24-41 | 18 pages
  • Author affiliations

    NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai Seika-cho, Souraku-gun, Kyoto 619-0237, Japan;

    NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai Seika-cho, Souraku-gun, Kyoto 619-0237, Japan;

    NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai Seika-cho, Souraku-gun, Kyoto 619-0237, Japan,University of Texas at Dallas (UTD), United States;

    NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai Seika-cho, Souraku-gun, Kyoto 619-0237, Japan;

    NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai Seika-cho, Souraku-gun, Kyoto 619-0237, Japan,Graduate School of Natural Sciences, Nagoya City University, Japan;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Discriminative criterion; Differenced maximum mutual information; Speech recognition; Acoustic model adaptation; Unsupervised adaptation;

