Acoustic and visual input speech recognition system - monitors lip and mouth movements by video camera to provide motion vector input to neural network based speech identification unit.

Abstract

The speech identification system combines acoustic data from a microphone (20) with video information obtained by monitoring the speaker's lip and mouth movements with a video camera (10). A position vector generator (14) feeds an interpolator (26), which generates a characteristic for input to a speech classification unit (200). The classification unit also receives the output of a spectral analyser operating on the acoustic signals. The speech classification unit takes the form of a multilayer time delay neural network, which responds to the spectral data for successful certification. USE/ADVANTAGE - Improves the probability of correct speech identification across speakers with different accents, different sexes, different speech rates and different degrees of coherency.
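The data flow described in the abstract can be illustrated with a minimal sketch. The abstract does not give the interpolation method, the feature dimensions, the layer sizes or the activation function, so everything below is an assumption made for illustration: linear interpolation brings the visual position vectors to the acoustic frame count, the two streams are concatenated frame by frame, and a single time-delay layer with a tanh activation slides over the fused frames. The function names `interpolate`, `fuse` and `time_delay_layer` are hypothetical and do not come from the patent.

```python
import math

def interpolate(frames, n_out):
    """Linearly resample a sequence of feature vectors to n_out frames.
    (Linear interpolation is an assumption; the abstract names no method.)"""
    if n_out == 1 or len(frames) == 1:
        return [list(frames[0]) for _ in range(n_out)]
    scale = (len(frames) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * scale
        lo = min(int(pos), len(frames) - 2)  # left neighbour index
        w = pos - lo                         # weight of the right neighbour
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[lo], frames[lo + 1])])
    return out

def fuse(visual, spectral):
    """Resample visual vectors to the spectral frame count and
    concatenate the two feature vectors for each frame."""
    resampled = interpolate(visual, len(spectral))
    return [v + s for v, s in zip(resampled, spectral)]

def time_delay_layer(frames, weights, bias):
    """One time-delay layer: each output unit sees a window of consecutive
    frames. weights[u][d][k] is unit u's weight for feature k at delay d."""
    delays = len(weights[0])
    out = []
    for t in range(len(frames) - delays + 1):
        row = []
        for u, unit in enumerate(weights):
            acc = bias[u]
            for d in range(delays):
                acc += sum(w * x for w, x in zip(unit[d], frames[t + d]))
            row.append(math.tanh(acc))  # tanh is an assumed activation
        out.append(row)
    return out
```

A multilayer network in the sense of the abstract would stack several such layers and end in one output unit per utterance class; the sketch stops at one layer to keep the windowing over time visible.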
