Journal: WSEAS Transactions on Signal Processing

SYLLABLE-BASED AUTOMATIC ARABIC SPEECH RECOGNITION IN NOISY-TELEPHONE CHANNEL



Abstract

The performance of speech recognizers that are well trained on high-quality, full-bandwidth speech data usually degrades in real-world environments. Telephone speech recognition is particularly difficult because of the limited bandwidth of transmission channels. In this paper, we concentrate on telephone recognition of Egyptian Arabic speech using syllables. Arabic spoken digits are described in terms of their constituent phonemes, triphones, syllables, and words. A speaker-independent, hidden Markov model (HMM)-based speech recognition system was designed using the Hidden Markov Model Toolkit (HTK). The database used for both training and testing comprises forty-four Egyptian speakers. In a clean environment, experiments show that the recognition rate using syllables exceeds the rates obtained using monophones, triphones, and words by 2.68%, 1.19%, and 1.79%, respectively. In a noisy telephone channel, syllables likewise outperform monophones, triphones, and words, by 2.09%, 1.5%, and 0.9%, respectively. These comparative experiments indicate that using syllables as acoustic units improves the recognition performance of HMM-based ASR systems in noisy environments. A syllable unit spans a longer time frame, typically three phones, thereby offering a more parsimonious framework for modeling pronunciation variation in spontaneous speech. Moreover, syllable-based recognition uses a relatively small number of units and runs faster than word-based recognition.
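To make the four unit types concrete, the sketch below decomposes one spoken digit into monophones, word-internal triphones (written in HTK's `left-centre+right` label style), syllables, and a whole-word unit. Everything in it is an assumption for illustration: the rough transcription of the Egyptian Arabic word for "three", the vowel set, the onset-based syllabification rule, and the function names are hypothetical and are not taken from the paper.

```python
# Illustrative only: the transcription, vowel set, and syllabification rule
# are assumptions, not the paper's actual lexicon or method.

def to_triphones(phones):
    """Build word-internal triphone labels in HTK style (l-c+r).

    Phones at the word edges get biphone labels because they lack
    a left or right context inside the word.
    """
    labels = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        labels.append(left + phone + right)
    return labels


def syllabify(phones, vowels):
    """Split phones into syllables with a simple onset rule (assumed).

    Each vowel takes the single consonant before it as its onset;
    any remaining consonants stay in the previous syllable as a coda.
    """
    starts = []
    for i, phone in enumerate(phones):
        if phone in vowels:
            has_onset = i > 0 and phones[i - 1] not in vowels
            starts.append(i - 1 if has_onset else i)
    if not starts:
        return [list(phones)]
    starts[0] = 0  # leading consonants belong to the first syllable
    bounds = starts + [len(phones)]
    return [list(phones[a:b]) for a, b in zip(bounds, bounds[1:])]


# Hypothetical rough transcription of "three" in Egyptian Arabic.
VOWELS = {"a", "aa"}
phones = ["t", "a", "l", "aa", "t", "a"]

units = {
    "monophones": phones,
    "triphones": to_triphones(phones),
    "syllables": ["_".join(s) for s in syllabify(phones, VOWELS)],
    "word": ["".join(phones)],
}

for name, seq in units.items():
    print(f"{name:10s} ({len(seq)} units): {' '.join(seq)}")
```

In this toy case the word is covered by six monophone or triphone units but only three syllable units, which illustrates the longer time span per unit that the abstract attributes to syllables.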
