Journal: International Journal of Mobile Communications

Multimodal systems for speech recognition



Abstract

This article presents a multimodal recognition system for Kazakh speech based on speech and lip recognition. The feature extraction phase uses several methods: voice activity detection (VAD), mel-frequency cepstral coefficients, perceptual linear prediction, relative perceptual linear prediction, and their first-order time derivatives. The article considers the main problems of Kazakh speech recognition, VAD algorithms and speech segmentation, and lip movement recognition. It describes probabilistic modelling of audiovisual speech based on coupled hidden Markov models (HMMs), information fusion methods with weight coefficients for the audio and video speech modalities, and the parametric representation of signals. Quantitative results for multimodal recognition of continuous Kazakh speech indicate high accuracy and reliability of the automatic system. The approach was also evaluated and compared in terms of computational time and recognition speed.
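The abstract mentions information fusion with weight coefficients for the audio and video modalities but does not give the formula. One common form of such fusion is a weighted sum of per-class log-likelihoods from the two streams. The sketch below illustrates that idea only; it is not the authors' method. The function names, the single `audio_weight` parameter, and the example scores are all hypothetical.

```python
def fuse_log_likelihoods(audio_ll, video_ll, audio_weight):
    """Weighted sum of per-class log-likelihoods from two modalities.

    Assumption: the weights sum to 1, so the video stream
    receives 1 - audio_weight. The article may use a different scheme.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be in [0, 1]")
    if audio_ll.keys() != video_ll.keys():
        raise ValueError("both modalities must score the same classes")
    video_weight = 1.0 - audio_weight
    return {
        label: audio_weight * audio_ll[label] + video_weight * video_ll[label]
        for label in audio_ll
    }


def recognise(audio_ll, video_ll, audio_weight):
    """Return the class label with the highest fused score."""
    fused = fuse_log_likelihoods(audio_ll, video_ll, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical log-likelihoods for two candidate words.
audio_scores = {"word_a": -12.0, "word_b": -10.0}
video_scores = {"word_a": -4.0, "word_b": -9.0}

# Trusting the audio stream (e.g. clean recording) favours word_b;
# trusting the video stream (e.g. noisy audio) favours word_a.
clean_audio_choice = recognise(audio_scores, video_scores, audio_weight=0.9)
noisy_audio_choice = recognise(audio_scores, video_scores, audio_weight=0.3)
```

In this toy setup the decision flips as the weight moves from the audio stream to the video stream, which is the reason for making the modality weights adjustable.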
