Journal: Indian Journal of Science and Technology

Lip Detection and Lip Geometric Feature Extraction using Constrained Local Model for Spoken Language Identification using Visual Speech Recognition

Abstract

Background/Objectives: The aim of our research is to identify the language of a spoken utterance using cues from visual speech recognition, i.e., from the movement of the lips. The first step towards this task is to detect the lips in a face image and then to extract various geometric features of the lip shape in order to infer the utterance. Methods/Statistical Analysis: This paper presents a methodology for detecting lips in face images using a constrained local model (CLM) and then extracting the geometric features of the lip shape. Lip detection involves two steps: CLM model building and CLM search. To extract lip geometric features, twenty feature points are defined on the lips, and lip height, width, and area are computed from these twenty points. Findings: The CLM model is built using images from the FGnet Talking face video database and tested using images from the same database as well as other images. Detection accuracy is higher for FGnet images than for other images. The feature vector describing the lip shape consists of geometric parameters such as the height, width, and area of the inner and outer lip contours. The feature vector is calculated for every test image after the lips are detected in the face image, so errors in lip detection propagate to the feature vector. This indicates the speaker dependency of visual speech recognition systems. Application/Improvements: The proposed approach is useful for lip detection and feature extraction in visual speech recognition. Minimizing speaker dependency and generalizing the approach should be considered for further improvement.
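The geometric features named in the abstract (height, width, and area of the inner and outer lip contours) can be illustrated with a minimal Python sketch. The abstract does not say how the twenty feature points are split between the two contours or how each quantity is measured, so everything here is an assumption rather than the authors' method: each contour is an ordered list of (x, y) points, height and width are taken as bounding-box extents, and area is computed with the shoelace formula. The function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(points: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula.

    Assumes the points are listed in order around the contour.
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def contour_features(points: Sequence[Point]) -> Dict[str, float]:
    """Height, width, and area of one lip contour.

    Assumption: height and width are bounding-box extents; the paper may
    instead measure them between specific landmark pairs.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return {
        "height": max(ys) - min(ys),
        "width": max(xs) - min(xs),
        "area": polygon_area(points),
    }


def lip_feature_vector(outer: Sequence[Point], inner: Sequence[Point]) -> List[float]:
    """Concatenate outer and inner contour features into one vector.

    Order (an arbitrary choice for this sketch):
    [outer height, outer width, outer area, inner height, inner width, inner area]
    """
    o = contour_features(outer)
    i = contour_features(inner)
    return [o["height"], o["width"], o["area"],
            i["height"], i["width"], i["area"]]
```

Because the vector is computed directly from the detected points, any landmark misplaced by the CLM search changes these values, which matches the abstract's remark that detection errors carry over into the feature vector.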
