Acoustic speech recognition method and system using stereo vision neural networks with competition and cooperation

Abstract

A method and system are provided for speech recognition. The speech recognition method includes the steps of preparing training data representing acoustic parameters of each phoneme at each time frame; receiving an input signal representing a sound to be recognized and converting the input signal to input data; comparing the input data at each frame with the training data of each phoneme to derive a similarity measure of the input data with respect to each phoneme; and processing the similarity measures obtained in the comparing step using a neural net model that governs the development of the activities of plural cells, thereby conducting speech recognition of the input signal. In the processing step, each cell is associated with one phoneme and one frame. The development of the activity of each cell at each frame is suppressed by the activities of other cells on the same frame that correspond to different phonemes, and is enhanced by the activities of other cells that correspond to the same phoneme at different frames. The phoneme of the cell that has developed the highest activity at a frame is determined as the winner at that frame, producing a list of winners at the respective frames. A phoneme is output as the recognition result for the input signal in accordance with the list of winners determined in the processing step.
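The competition and cooperation scheme described in the abstract can be illustrated with a minimal sketch. This is not the patented neural net model: the update rule, the parameters `rate`, `inhibit`, `excite`, and `window`, the clipping of activities to [0, 1], and the majority vote over the winner list are all assumptions chosen for illustration. The function name `recognize` and the toy similarity values are hypothetical as well.

```python
from collections import Counter


def recognize(similarity, phonemes, steps=20, rate=0.2,
              inhibit=0.5, excite=0.5, window=2):
    """Toy competition/cooperation network (illustrative assumption only).

    similarity[p][t] is the similarity of input frame t to phoneme p,
    with larger values meaning a closer match.
    """
    n_ph, n_fr = len(similarity), len(similarity[0])
    act = [row[:] for row in similarity]  # one cell per (phoneme, frame)
    for _ in range(steps):
        new = [[0.0] * n_fr for _ in range(n_ph)]
        for p in range(n_ph):
            for t in range(n_fr):
                # Competition: cells for other phonemes on the same frame.
                rivals = sum(act[q][t] for q in range(n_ph) if q != p)
                # Cooperation: cells for the same phoneme on nearby frames.
                lo, hi = max(0, t - window), min(n_fr, t + window + 1)
                nbrs = [act[p][u] for u in range(lo, hi) if u != t]
                support = sum(nbrs) / len(nbrs) if nbrs else 0.0
                drive = (similarity[p][t] + excite * support
                         - inhibit * rivals - act[p][t])
                # Assumed update: small step, activity kept within [0, 1].
                new[p][t] = min(1.0, max(0.0, act[p][t] + rate * drive))
        act = new
    # Winner at each frame: the phoneme whose cell has the highest activity.
    winners = [phonemes[max(range(n_ph), key=lambda p: act[p][t])]
               for t in range(n_fr)]
    # Assumed decision rule: majority vote over the per-frame winners.
    result = Counter(winners).most_common(1)[0][0]
    return winners, result


# Toy input: frame 2 slightly favours "i", its neighbours favour "a".
phonemes = ["a", "i"]
similarity = [
    [0.9, 0.8, 0.45, 0.8, 0.9],  # similarity to "a" per frame
    [0.1, 0.2, 0.55, 0.2, 0.1],  # similarity to "i" per frame
]
winners, result = recognize(similarity, phonemes)
```

In this toy input the middle frame on its own would pick "i"; the cooperative term from neighbouring frames is what lets the network reconsider that frame in context, which is the effect the abstract attributes to same-phoneme cells at different frames.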

Bibliographic data

Publication number: US6947890B1
Publication date: 2005-09-20
Original format: PDF
Applicant/patentee: TETSURO KITAZOE; SUNG-ILL KIM; TOMOYUKI ICHIKI
Application number: US20000580449
Inventors: SUNG-ILL KIM; TETSURO KITAZOE; TOMOYUKI ICHIKI
Filing date: 2000-05-30
Classification: G10L15/16
Country: US
Database entry time: 2022-08-21 22:20:02
