Asian Language Processing, 2009. IALP '09

Advances in Acoustic Modeling for Vietnamese LVCSR

Abstract

In this paper, we present experiments on the selection of basic phonetic units for Vietnamese large vocabulary continuous speech recognition (LVCSR). Two acoustic models were compared. The first model uses only vowels or monophthongs as phonemes [2], while the second, proposed in this paper, also explores the use of diphthongs and triphthongs as phonemes. Both models were trained and evaluated on a Broadcast News corpus containing 27 hours of acoustic training data and 1 hour of acoustic test data. In addition, a 146M-word collection of newspaper text was used to build the language models. Experimental results indicate significant improvements in both word accuracy and execution time. With the second acoustic model, word accuracy reaches 86.06% in the best case, and execution is faster than real time.
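The abstract reports two metrics without defining them. As a rough illustration only, the sketch below assumes the conventional definitions: word accuracy as one minus the word-level edit distance divided by the reference length, and real-time factor as decoding time divided by audio duration (a value below 1 means faster than real time). The function names and the toy sentences are hypothetical and do not come from the paper, which may score its results differently.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance: substitutions + deletions + insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(ref_words, hyp_words):
    """Assumed definition: 1 - (S + D + I) / N, with N the reference length."""
    return 1.0 - edit_distance(ref_words, hyp_words) / len(ref_words)


def real_time_factor(decode_seconds, audio_seconds):
    """Assumed definition: processing time per second of audio."""
    return decode_seconds / audio_seconds


# Toy data, not taken from the Broadcast News corpus.
reference = "tôi đi học hôm nay".split()
hypothesis = "tôi đi hôm nay".split()   # one token dropped by the recognizer

accuracy = word_accuracy(reference, hypothesis)
rtf = real_time_factor(decode_seconds=1800.0, audio_seconds=3600.0)
print(f"word accuracy: {accuracy:.2%}, real-time factor: {rtf:.2f}")
```

Under these assumed definitions, accuracy can drop below zero when the hypothesis contains many insertions, which is one reason some toolkits also report a separate "percent correct" figure.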