Journal: Computer Speech and Language

Active learning and semi-supervised learning for speech recognition: A unified framework using the global entropy reduction maximization criterion


Abstract

We propose a unified global entropy reduction maximization (GERM) framework for active learning and semi-supervised learning in speech recognition. Active learning aims to select a limited subset of utterances for transcription from a large pool of un-transcribed utterances, while semi-supervised learning addresses the problem of selecting the right transcriptions for un-transcribed utterances; in both cases the goal is to maximize the accuracy of the automatic speech recognition system. We show that both the traditional confidence-based active learning and semi-supervised learning approaches can be improved by maximizing the lattice entropy reduction over the whole dataset. We introduce our criterion and framework, show how the criterion can be simplified and approximated, and describe how these approaches can be combined. We demonstrate the effectiveness of our new framework and algorithm on directory assistance data collected under real usage scenarios and show that our GERM-based active learning and semi-supervised learning algorithms consistently outperform their confidence-based counterparts by a significant margin. Our new active learning algorithm cuts the number of utterances that must be transcribed by 50% to achieve the same recognition accuracy as the confidence-based active learning approach, and by 60% compared to random sampling. Our new semi-supervised algorithm determines, in a principled way, the cutoff point for deciding which utterance-transcription pairs to use, and we demonstrate that the point it finds is very close to the achievable peak point.
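
To make the selection problem concrete, here is a minimal sketch of a per-utterance uncertainty baseline for active learning: rank un-transcribed utterances by the entropy of their hypothesis posteriors and send the most uncertain ones for transcription. This is not the GERM criterion itself, which the abstract describes as maximizing entropy reduction over the whole dataset rather than scoring each utterance in isolation. The function names, the N-best posterior representation, and the toy data are all assumptions made for illustration.

```python
import math

def hypothesis_entropy(posteriors):
    """Shannon entropy (in nats) of a posterior over competing hypotheses.

    `posteriors` is a hypothetical list of non-negative scores for the
    hypotheses of one utterance; it is normalized here before use.
    """
    total = sum(posteriors)
    probs = [p / total for p in posteriors if p > 0]
    return -sum(p * math.log(p) for p in probs)

def select_for_transcription(nbest_posteriors, budget):
    """Return the ids of the `budget` utterances with the highest entropy."""
    ranked = sorted(
        nbest_posteriors,
        key=lambda utt: hypothesis_entropy(nbest_posteriors[utt]),
        reverse=True,
    )
    return ranked[:budget]

# Toy pool (made-up numbers): utterance id -> posteriors of its hypotheses.
pool = {
    "utt_a": [0.97, 0.02, 0.01],  # recognizer is confident
    "utt_b": [0.40, 0.35, 0.25],  # recognizer is very uncertain
    "utt_c": [0.70, 0.20, 0.10],
}
selected = select_for_transcription(pool, budget=2)
```

A baseline like this scores each utterance independently, so it can spend the transcription budget on several near-identical uncertain utterances. A criterion defined over the whole dataset, as the abstract proposes, is one way to account for how transcribing one utterance affects uncertainty on the others.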
