Hyper column model vs. fast DCT for feature extraction in visual Arabic speech recognition


Abstract

Recently, the multimedia signal processing community has shown increasing interest in research on visual speech recognition. In this paper we present a novel visual speech recognition approach based on our hyper column model (HCM). HCM is used for the feature extraction task. The extracted features are modeled by Gaussian distributions using a hidden Markov model (HMM). The proposed system, combining HCM and HMM, can be used for any visual recognition task. Here we use it to build a complete lip-reading system and evaluate its performance on an Arabic data set. To our knowledge, this is the first time that visual speech recognition has been applied to the Arabic language. For a fair evaluation, we compare our accuracy results with those obtained using the fast discrete cosine transform (FDCT) approach, in a separate experiment that uses the same data set and conditions as the HCM experiment. The comparison shows that HCM achieves higher recognition accuracy than FDCT for Arabic sentences and words. HCM not only provides higher accuracy but is also capable of shift-invariant recognition, whereas FDCT is not.
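As a rough illustration of the DCT baseline mentioned in the abstract, the sketch below computes a 2-D DCT of a grayscale mouth region and keeps the low-frequency block as a per-frame feature vector. It is not the authors' implementation: the function names (`dct_matrix`, `dct_features`), the `keep` parameter, the region size, and the random placeholder frames are all assumptions made for this example, and the transform is a plain matrix DCT-II rather than a fast algorithm.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m


def dct_features(roi: np.ndarray, keep: int = 6) -> np.ndarray:
    """2-D DCT of a grayscale region; keep the top-left keep x keep block.

    The block size `keep` is a hypothetical choice for this sketch.
    """
    rows, cols = roi.shape
    coeffs = dct_matrix(rows) @ roi @ dct_matrix(cols).T
    return coeffs[:keep, :keep].ravel()


# Placeholder data standing in for five 32x48 mouth-region frames.
rng = np.random.default_rng(0)
frames = rng.random((5, 32, 48))

# One feature vector per frame; a sequence like this is what an HMM
# with Gaussian emissions would be trained on.
features = np.stack([dct_features(f) for f in frames])

# Shifting the region horizontally changes the retained coefficients,
# which illustrates why plain DCT features are not shift invariant.
shifted = np.roll(frames[0], 4, axis=1)
changed = not np.allclose(dct_features(frames[0]), dct_features(shifted))
```

In a full system the rows of `features` would be fed, utterance by utterance, to an HMM trainer; that stage is omitted here because the abstract gives no details about the model topology.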
