Published in: 中国科技论文

Speaker Recognition Based on Image Information

     

Abstract

This paper proposes a scheme for speaker recognition based on image information. A dataset of 916 samples is constructed, each consisting of 20 consecutive image frames. The task is divided into two stages: first, face recognition technology is used to locate the mouth region of every face and lip movement detection is performed; second, face recognition is applied to the faces in which lip movement was detected. Two methods are designed to build the lip movement detection model. In the first, the distance between the upper and lower lips and the width of the nose are measured on the face in each frame, the ratio of lip distance to nose width is used as the feature of that frame, and a support vector machine is trained on the features of all samples. In the second, the lip region of the face is cropped from each frame, a convolutional neural network extracts features from the cropped lip images, and these features are fed to a long short-term memory network, which is trained for temporal classification. Experimental results show that speaker recognition based on image information can achieve high accuracy.
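The ratio feature used by the first lip movement detection method can be illustrated with a minimal sketch. It is not the authors' implementation: the `FrameLandmarks` type, its field names, and the choice of one landmark point per lip are assumptions made for illustration, and the abstract does not say which landmark detector was used. The sketch only builds the per-sample feature vector (one ratio per frame, 20 frames per sample); such a vector could then be passed to any support vector machine implementation.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class FrameLandmarks:
    """Hypothetical landmarks for one face in one frame (names are assumptions)."""
    nose_left: Point   # left edge of the nose
    nose_right: Point  # right edge of the nose
    upper_lip: Point   # a point on the upper lip
    lower_lip: Point   # a point on the lower lip


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two 2-D points."""
    return hypot(a[0] - b[0], a[1] - b[1])


def lip_ratio(frame: FrameLandmarks) -> float:
    """Ratio of the lip gap to the nose width for a single frame."""
    nose_width = distance(frame.nose_left, frame.nose_right)
    if nose_width == 0:
        raise ValueError("nose width is zero; landmarks are degenerate")
    return distance(frame.upper_lip, frame.lower_lip) / nose_width


def sample_features(frames: Sequence[FrameLandmarks],
                    expected_frames: int = 20) -> List[float]:
    """Feature vector for one sample: one ratio per consecutive frame."""
    if len(frames) != expected_frames:
        raise ValueError(f"expected {expected_frames} frames, got {len(frames)}")
    return [lip_ratio(f) for f in frames]
```

Dividing by the nose width makes the feature independent of how large the face appears in the image, which is presumably why a ratio is used rather than the raw lip distance.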
