Viseme recognition - a comparative study

Abstract

Three classification algorithms for visual mouth appearances (visemes), which correspond to phonemes and their speech contexts, were compared with respect to recognition rate, time complexity, and ROC performance. Two feature extraction procedures were evaluated. The first is based on a normalized triangle mesh (MESH) covering the mouth area and a color image texture vector indexed by barycentric coordinates. The second performs a DFT on the image rectangle containing the mouth and works with small blocks of DFT coefficients. The classifiers were designed using a PCA approach and an optimized LDA method that uses a two-singular-subspaces approach. DFT+LDA achieves a higher recognition rate than MESH+LDA and MESH+PCA: 97.6% versus 94.4% and 90.2%, respectively. It is also roughly five times faster than MESH+PCA (5 ms per video frame versus 26 ms on a 3.2 GHz Pentium IV) but slower than MESH+LDA (5 ms versus 1 ms).
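To make the DFT-based pipeline more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It keeps the magnitudes of a low-frequency block of 2-D DFT coefficients as features, projects them onto a PCA subspace, and classifies by nearest class mean. The block size of 8, the use of magnitudes, the nearest-class-mean rule, and every function name are assumptions made for illustration. The optimized LDA with two singular subspaces described in the abstract is not reproduced here, and the toy sinusoid images merely stand in for real mouth regions.

```python
import numpy as np


def dft_block_features(image, block=8):
    """Magnitudes of the low-frequency block x block corner of the 2-D DFT.

    The block size and the use of magnitudes are illustrative assumptions.
    """
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum[:block, :block]).ravel()


def fit_pca(features, n_components):
    """Return the feature mean and the leading principal directions."""
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(features, mean, components):
    """Project centered features onto the PCA subspace."""
    return (features - mean) @ components.T


def nearest_class_mean(train_z, train_y, test_z):
    """Assign each test vector to the class with the closest centroid."""
    labels = np.unique(train_y)
    centroids = np.stack([train_z[train_y == c].mean(axis=0) for c in labels])
    dists = np.linalg.norm(test_z[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]


def make_toy_images(rng, n_per_class, size=32):
    """Hypothetical stand-in data: two classes of noisy sinusoid patterns."""
    wave = np.cos(2 * np.pi * 2 * np.arange(size) / size)
    class0 = np.tile(wave, (size, 1))  # pattern varies along columns
    class1 = class0.T                  # pattern varies along rows
    images = np.concatenate([
        class0 + rng.normal(0.0, 0.1, (n_per_class, size, size)),
        class1 + rng.normal(0.0, 0.1, (n_per_class, size, size)),
    ])
    labels = np.repeat([0, 1], n_per_class)
    return images, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_x, train_y = make_toy_images(rng, 20)
    test_x, test_y = make_toy_images(rng, 10)
    f_train = np.stack([dft_block_features(im) for im in train_x])
    f_test = np.stack([dft_block_features(im) for im in test_x])
    mean, comps = fit_pca(f_train, n_components=2)
    pred = nearest_class_mean(project(f_train, mean, comps), train_y,
                              project(f_test, mean, comps))
    print("toy accuracy:", (pred == test_y).mean())
```

Keeping only a small block of coefficients shortens the feature vector considerably (64 values for a 32x32 toy image here), which is one plausible reason a DFT-based front end can be fast per frame. The timings quoted in the abstract come from the authors' own system and are not reproduced by this sketch.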
