IEEE Transactions on Audio, Speech and Language Processing

Discriminatively Trained GMMs for Language Classification Using Boosting Methods



Abstract

In language identification and other speech applications, discriminatively trained models often outperform nondiscriminative models trained with the maximum-likelihood criterion. For instance, discriminative Gaussian mixture models (GMMs) are typically trained by optimizing some discriminative criteria that can be computationally expensive and complex to implement. In this paper, we explore a novel approach to discriminative GMM training by using a variant of the boosting framework (R. Schapire, "The boosting approach to machine learning: an overview," Proc. MSRI Workshop on Nonlinear Estimation and Classification, 2002) from machine learning, in which an ensemble of GMMs is trained sequentially. We have extended the purview of boosting to class-conditional models (as opposed to discriminative models such as classification trees). The effectiveness of our boosting variation comes from the emphasis on working with the misclassified data to achieve discriminatively trained models. Our variant of boosting also utilizes low-confidence data classifications as well as misclassified examples in classifier generation. We further apply our boosting approach to anti-models to achieve additional performance gains. We have applied our discriminative training approach to a variety of language identification experiments using the 12-language NIST 2003 language identification task, and we show the significant performance improvements that can be obtained. The experiments include both acoustic as well as token-based speech models. Our best performing boosted GMM-based system on the 12-language verification task has a 2.3% EER.
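The abstract's core idea, training an ensemble of class-conditional models sequentially while re-weighting misclassified examples, can be illustrated with a minimal sketch. The example below is an assumption-laden simplification, not the paper's actual system: it substitutes a single diagonal-covariance Gaussian per class for a full GMM, uses a standard AdaBoost-style weight update, and runs on synthetic two-class data standing in for two languages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-class "language" data: two overlapping Gaussian clouds.
n = 200
X = np.vstack([rng.normal([0, 0], 1.0, (n, 2)),
               rng.normal([2, 2], 1.0, (n, 2))])
y = np.array([0] * n + [1] * n)

def fit_weighted_gaussian(X, w):
    """Weighted ML estimate of a diagonal-covariance Gaussian
    (a 1-component stand-in for a full GMM)."""
    w = w / w.sum()
    mu = (w[:, None] * X).sum(axis=0)
    var = (w[:, None] * (X - mu) ** 2).sum(axis=0) + 1e-6
    return mu, var

def log_lik(X, mu, var):
    """Per-sample diagonal-Gaussian log-likelihood."""
    return -0.5 * ((X - mu) ** 2 / var + np.log(2 * np.pi * var)).sum(axis=1)

def boost_gaussian_classifier(X, y, rounds=5):
    """Sequentially train class-conditional models, up-weighting
    misclassified examples each round (AdaBoost-style update)."""
    n = len(X)
    w = np.full(n, 1.0 / n)            # example weights
    ensemble = []                      # list of (alpha, per-class models)
    for _ in range(rounds):
        models = {c: fit_weighted_gaussian(X[y == c], w[y == c])
                  for c in (0, 1)}
        scores = log_lik(X, *models[1]) - log_lik(X, *models[0])
        pred = (scores > 0).astype(int)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((alpha, models))
        # Emphasize misclassified examples; the paper's variant also
        # up-weights low-confidence correct ones, which is omitted here.
        w *= np.exp(alpha * np.where(pred != y, 1.0, -1.0))
        w /= w.sum()
    return ensemble

def predict(ensemble, X):
    """Weighted vote over the log-likelihood-ratio decisions."""
    total = np.zeros(len(X))
    for alpha, models in ensemble:
        total += alpha * np.sign(log_lik(X, *models[1]) -
                                 log_lik(X, *models[0]))
    return (total > 0).astype(int)

ensemble = boost_gaussian_classifier(X, y)
acc = (predict(ensemble, X) == y).mean()
```

The emphasis on misclassified data appears in the weight update: examples the current round gets wrong receive larger weight, so the next round's class models shift toward the hard cases near the decision boundary.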
