Method and device for extracting a visual feature vector from a sequence of images, and speech recognition system
Abstract
A facial feature extraction method, and a device for carrying it out, use the change in light intensity across a frontal view of a speaker's face. The sequence of video data is sampled and quantised into a uniform pixel array, whose scan lines and pixel positions form a coordinate system. The left and right eye regions and the mouth region are located by thresholding the pixel grey levels and finding the centroids of the three resulting regions. The perpendicular bisector of the line segment connecting the two eye-region centroids forms an axis of symmetry. A straight line through the mouth-region centroid forms the mouth line. The pixels along the mouth line and along the axis of symmetry form a horizontal and a vertical grey-scale profile, respectively. The maxima and minima of these profiles that correspond to physiologically important speech features, such as the lower and upper lips, the corners of the mouth, and mouth region positions, are selected as the feature vector. A speech recognition system uses this visual feature vector, in combination with an accompanying acoustic vector, as input to a time-delay neural network.
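The geometric steps described in the abstract can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the caller-supplied search boxes, the "darker than threshold" test, nearest-pixel sampling, and the choice to run the mouth line parallel to the eye line are all hypothetical details that the abstract does not specify.

```python
import numpy as np

def region_centroid(gray, box, threshold):
    """Centroid (row, col) of pixels darker than `threshold` inside `box`.

    `box` = (row0, row1, col0, col1) is an assumed, caller-supplied search area.
    """
    r0, r1, c0, c1 = box
    rows, cols = np.nonzero(gray[r0:r1, c0:c1] < threshold)
    if rows.size == 0:
        raise ValueError("no pixels below threshold in box")
    return np.array([r0 + rows.mean(), c0 + cols.mean()])

def face_axes(left_eye, right_eye):
    """Midpoint of the eye segment plus unit directions of the axis of
    symmetry (perpendicular bisector) and of the eye line, in (row, col)."""
    eye_dir = (right_eye - left_eye) / np.linalg.norm(right_eye - left_eye)
    axis_dir = np.array([eye_dir[1], -eye_dir[0]])  # eye_dir rotated by 90 degrees
    midpoint = (left_eye + right_eye) / 2.0
    return midpoint, axis_dir, eye_dir

def sample_line(gray, point, direction, half_length):
    """Grey-level profile along a line through `point` (nearest-pixel sampling,
    coordinates clipped to the image border)."""
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    t = np.arange(-half_length, half_length + 1)
    pts = np.rint(np.asarray(point) + t[:, None] * direction).astype(int)
    pts[:, 0] = np.clip(pts[:, 0], 0, gray.shape[0] - 1)
    pts[:, 1] = np.clip(pts[:, 1], 0, gray.shape[1] - 1)
    return gray[pts[:, 0], pts[:, 1]]

def local_extrema(profile):
    """Indices of strict local minima and maxima of a 1-D profile."""
    p = np.asarray(profile, dtype=float)
    inner = np.arange(1, p.size - 1)
    minima = inner[(p[1:-1] < p[:-2]) & (p[1:-1] < p[2:])]
    maxima = inner[(p[1:-1] > p[:-2]) & (p[1:-1] > p[2:])]
    return minima, maxima

def visual_profiles(gray, eye_boxes, mouth_box, threshold, half_length):
    """Horizontal (mouth line) and vertical (axis of symmetry) profiles.

    Assumption: the mouth line runs parallel to the eye line, and the vertical
    profile is centred where the axis of symmetry crosses the mouth line.
    """
    left = region_centroid(gray, eye_boxes[0], threshold)
    right = region_centroid(gray, eye_boxes[1], threshold)
    mouth = region_centroid(gray, mouth_box, threshold)
    midpoint, axis_dir, eye_dir = face_axes(left, right)
    # Project the mouth centroid onto the axis of symmetry.
    centre = midpoint + np.dot(mouth - midpoint, axis_dir) * axis_dir
    horizontal = sample_line(gray, mouth, eye_dir, half_length)
    vertical = sample_line(gray, centre, axis_dir, half_length)
    return horizontal, vertical
```

Applying `local_extrema` to each profile gives candidate positions from which a feature vector could be assembled. Strict comparison ignores flat plateaus, so a real profile would likely need smoothing first; deciding which extrema correspond to the lips or the corners of the mouth is left open here.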