Journal: IEICE Transactions on Information and Systems

Phonology and Morphology Modeling in a Very Large Vocabulary Hungarian Dictation System

Abstract

This article introduces a novel approach to modeling phonology and morphosyntax in morpheme unit-based speech recognizers. The proposed methods are evaluated on a Hungarian newspaper dictation task that requires modeling over 1 million different word forms. The architecture of the recognition system is based on the weighted finite-state transducer (WFST) paradigm. The vocabulary units used in the system are morpheme-based in order to provide sufficient coverage of the large number of word forms resulting from affixation and compounding. Besides the basic pronunciation model and the morpheme N-gram language model, we evaluate a novel phonology model and a novel stochastic morphosyntactic language model (SMLM). Thanks to the flexible transducer-based architecture of the system, these new components are integrated seamlessly with the basic modules, with no need to modify the decoder itself. We compare the phoneme, morpheme, and word error rates as well as the sizes of the recognition networks in two configurations: one uses only the N-gram model, while the other uses the combined model. Compared to the baseline trigram system, the proposed stochastic morphosyntactic language model yields a relative reduction in morpheme error rate of between 1.7% and 7.2%. The proposed phonology model reduces the error rate by 8.32%. The morpheme error rate of the best configuration is 18% and the best word error rate is 22.3%.
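The abstract reports morpheme and word error rates and relative reductions without defining them. The sketch below shows one common way such figures are computed: token-level edit distance divided by reference length, and relative reduction as (baseline - improved) / baseline. It is an illustration under stated assumptions, not the authors' evaluation code; the function names, the example segmentation of "házakban" into "ház" + "ak" + "ban", and all numbers in it are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference token deleted
                            curr[j - 1] + 1,      # hypothesis token inserted
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative (not absolute) reduction of an error rate."""
    return (baseline - improved) / baseline


# Hypothetical example: the same recognition error scored at two levels.
ref_morphemes = ["a", "ház", "ak", "ban"]
hyp_morphemes = ["a", "ház", "ban"]
ref_words = ["a", "házakban"]
hyp_words = ["a", "házban"]

morpheme_er = error_rate(ref_morphemes, hyp_morphemes)
word_er = error_rate(ref_words, hyp_words)
```

In this toy case one dropped morpheme costs one error out of four morphemes but one error out of two words, which is one reason morpheme and word error rates for the same system can differ.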