IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Discriminative approach to lexical entry selection for automatic speech recognition of agglutinative language

Abstract

In agglutinative languages, the choice of lexical unit is not obvious. The morpheme is usually adopted as the unit to ensure sufficient coverage, but many morphemes are short, which results in weak constraints and possible confusion. In this paper, we propose a discriminative approach to selecting lexical entries that directly contribute to ASR error reduction. We define an evaluation function for each word in terms of a set of features and their weights, and define the optimization measure as the difference between the word error rates (WERs) of the morpheme-based model and the word-based model. The feature weights are then learned with a perceptron algorithm. Finally, word (or sub-word) entries with higher evaluation scores are selected and added to the lexicon. The method is successfully applied to a Uyghur large-vocabulary continuous speech recognition system, resulting in a significant reduction in both WER and lexicon size. Further improvement is achieved by combining it with a statistical method based on a mutual information criterion.
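The selection procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, the exact update rule, or how the WER difference is measured per entry, so everything concrete here is an assumption made for illustration: the three toy features (a bias term, a frequency-like value, a length-like value), the use of the sign of the WER difference as the training target, the standard perceptron update, and the names `score`, `train_perceptron`, and `select_entries`.

```python
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]


def score(weights: Sequence[float], features: Features) -> float:
    """Evaluation function: weighted sum of an entry's features."""
    return sum(w * f for w, f in zip(weights, features))


def train_perceptron(
    samples: List[Tuple[Features, float]],
    n_features: int,
    epochs: int = 10,
    lr: float = 1.0,
) -> List[float]:
    """Learn feature weights with a standard perceptron update.

    Each sample is (features, wer_gain), where wer_gain is assumed to be
    WER(morpheme-based) - WER(word-based) for that candidate entry.
    A positive gain means the word entry helped recognition.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        mistakes = 0
        for features, wer_gain in samples:
            if wer_gain == 0:
                continue  # no evidence either way
            target = 1 if wer_gain > 0 else -1
            predicted = 1 if score(weights, features) > 0 else -1
            if predicted != target:
                # Move the weights toward the correct side of the boundary.
                weights = [w + lr * target * f for w, f in zip(weights, features)]
                mistakes += 1
        if mistakes == 0:
            break  # all training samples are classified correctly
    return weights


def select_entries(
    candidates: Dict[str, Features], weights: Sequence[float], top_k: int
) -> List[str]:
    """Return the top_k candidates ranked by evaluation score."""
    ranked = sorted(
        candidates, key=lambda e: score(weights, candidates[e]), reverse=True
    )
    return ranked[:top_k]


# Toy data (hypothetical): entry -> [bias, frequency-like, length-like].
candidates = {
    "a": [1.0, 2.0, 3.0],
    "b": [1.0, 0.5, 1.0],
    "c": [1.0, 3.0, 2.0],
    "d": [1.0, 0.2, 1.0],
}
# Hypothetical WER differences (morpheme-based minus word-based).
wer_gain = {"a": 0.5, "b": -0.2, "c": 0.1, "d": -0.4}

samples = [(candidates[e], wer_gain[e]) for e in candidates]
weights = train_perceptron(samples, n_features=3)
selected = select_entries(candidates, weights, top_k=2)
print(selected)
```

In this toy setup the entries with a positive WER difference end up with the highest scores and are the ones added to the lexicon. A real system would derive the WER differences from recognition results on held-out speech and would likely use richer features than the ones assumed here.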
