Phoneme set selection for Russian speech recognition

Abstract

In this paper, we describe a method for phoneme set selection that combines phonological and statistical information, and we apply it to Russian speech recognition. The phoneme sets currently used for Russian are mostly rule-based or heuristically derived from the standard SAMPA or IPA phonetic alphabets. For some other languages, however, statistical methods have proved useful for phoneme set optimization. In Russian, almost all phonemes come in pairs: consonants can be hard or soft, and vowels can be stressed or unstressed. We start with a large phoneme set and gradually reduce it by merging phoneme pairs. The decision about which pair to merge is based on phonetic pronunciation rules and on statistics obtained from the confusion matrix of phoneme recognition experiments. Applying this approach to the IPA Russian phonetic set, we first reduced it to 47 phonemes, which served as the initial set in the subsequent speech model training. Based on the phoneme confusion results, we derived several other phoneme sets of different sizes, down to 27 phonemes. Speech recognition experiments with these sets showed that the reduced phoneme sets perform better than the initial phoneme set for phoneme recognition and equally well for word-level speech recognition.
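The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the symmetric confusion score, the toy counts, and the phoneme labels (`t`, `t'`, `a`, `a!`) are all assumptions made for illustration. It also folds the confusion matrix in place after each merge, whereas a real experiment would presumably retrain the models and measure confusions again.

```python
from typing import Dict, List, Tuple

Confusion = Dict[str, Dict[str, int]]  # reference phoneme -> recognized phoneme -> count
Pair = Tuple[str, str]                 # (phoneme to keep, phoneme to drop)


def confusion_rate(confusion: Confusion, a: str, b: str) -> float:
    """Share of tokens of a and b that were recognized as the other one (assumed score)."""
    total = sum(confusion[a].values()) + sum(confusion[b].values())
    mixed = confusion[a].get(b, 0) + confusion[b].get(a, 0)
    return mixed / total if total else 0.0


def merge_pair(confusion: Confusion, keep: str, drop: str) -> Confusion:
    """Fold phoneme `drop` into `keep` in both the rows and the columns."""
    merged: Confusion = {}
    for ref, row in confusion.items():
        new_ref = keep if ref == drop else ref
        out = merged.setdefault(new_ref, {})
        for hyp, count in row.items():
            new_hyp = keep if hyp == drop else hyp
            out[new_hyp] = out.get(new_hyp, 0) + count
    return merged


def reduce_phoneme_set(
    confusion: Confusion, candidates: List[Pair], target_size: int
) -> Tuple[Confusion, List[Pair]]:
    """Greedily merge the most confused rule-approved pair until target_size is reached."""
    candidates = list(candidates)
    merges: List[Pair] = []
    while len(confusion) > target_size and candidates:
        keep, drop = max(candidates, key=lambda p: confusion_rate(confusion, *p))
        confusion = merge_pair(confusion, keep, drop)
        merges.append((keep, drop))
        # Discard every candidate that still mentions the phoneme just removed.
        candidates = [p for p in candidates if drop not in p]
    return confusion, merges


# Toy data (invented): hard/soft "t" and unstressed/stressed "a".
toy_confusion: Confusion = {
    "t": {"t": 80, "t'": 15, "a": 5},
    "t'": {"t'": 70, "t": 25, "a": 5},
    "a": {"a": 90, "a!": 10},
    "a!": {"a!": 95, "a": 5},
}
# Pairs that the phonological rules allow to be merged (assumed here).
toy_candidates: List[Pair] = [("t", "t'"), ("a", "a!")]

reduced, merges = reduce_phoneme_set(toy_confusion, toy_candidates, target_size=3)
```

Restricting the search to `toy_candidates` stands in for the phonological constraint, and the confusion score stands in for the statistical one. With these toy counts the hard/soft `t` pair is confused more often than the vowel pair, so it is merged first.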
