Annual Conference of the International Speech Communication Association; INTERSPEECH 2011

Semi-automatic acoustic model generation from large unsynchronized audio and text chunks


Abstract

This paper presents an effective technique for training an acoustic model from large, unsynchronized audio and text chunks. Given such a speech corpus, it defines an algorithm that automatically segments each chunk into smaller fragments and synchronizes them with the corresponding text. These smaller fragments are better suited to standard model training algorithms used in automatic speech recognition systems. The proposed approach is particularly suitable for bootstrapping language models without relying on specialized training material or borrowing from models trained for other, similar languages. Extensive experimentation using the CMU Sphinx 4 recognizer and the SphinxTrain model generator, in a setting designed for large-vocabulary continuous speech recognition, shows the effectiveness of the approach.
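
The abstract does not spell out how chunks are segmented and synchronized, so the following is only an illustrative sketch of one common idea, not the authors' algorithm: compare a recognizer's time-stamped hypothesis with the reference text and keep the stretches where the two agree as training fragments. The function name `split_on_anchors`, the `(word, start_sec, end_sec)` input format, and the `min_run` threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def split_on_anchors(hyp, ref, min_run=3):
    """Return fragments where the hypothesis agrees with the reference text.

    hyp: list of (word, start_sec, end_sec) tuples, a hypothetical
         recognizer output format assumed for this sketch.
    ref: list of reference transcript words for the same chunk.
    min_run: minimum number of consecutive matching words for a run
             to be trusted as an anchor (an assumed heuristic).

    Each fragment is (start_sec, end_sec, words).
    """
    hyp_words = [word for word, _, _ in hyp]
    matcher = SequenceMatcher(a=hyp_words, b=ref, autojunk=False)

    fragments = []
    for block in matcher.get_matching_blocks():
        # The final block returned by difflib is a size-0 sentinel;
        # the min_run filter discards it along with short matches.
        if block.size < min_run:
            continue
        start_sec = hyp[block.a][1]
        end_sec = hyp[block.a + block.size - 1][2]
        words = ref[block.b:block.b + block.size]
        fragments.append((start_sec, end_sec, words))
    return fragments


if __name__ == "__main__":
    hyp = [("the", 0.0, 0.2), ("cat", 0.2, 0.5), ("sat", 0.5, 0.9),
           ("in", 0.9, 1.0), ("the", 1.0, 1.1), ("mat", 1.1, 1.5)]
    ref = ["the", "cat", "sat", "on", "the", "mat"]
    for fragment in split_on_anchors(hyp, ref, min_run=2):
        print(fragment)
```

In this sketch a larger `min_run` trades coverage for confidence: fewer fragments survive, but each is less likely to pair audio with the wrong text.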
