Segment-based similarity method for low complexity speech recognizer

Abstract

A digital word prototype is constructed from one or more speech utterances of a given spoken word or phrase. First, a phone model is used to derive a phoneme similarity time series for each of a plurality of phonemes; each time series represents the degree of similarity between the speech utterance and one of the standard phonemes contained in the phone model. Next, the phoneme similarity data is normalized in relation to a non-speech part of the input speech signal. The normalized phoneme similarity data is then divided into segments such that the sum of all normalized phoneme similarity values within a segment is equal for every segment. Next, a word model is constructed from the normalized phoneme similarity data: within each segment, a summation value is determined for each phoneme by summing, over the speech frames of that segment, the normalized phoneme similarity values associated with that phoneme. In this way, the word model is represented by a vector of summation values that compactly summarizes the normalized phoneme similarity data. Lastly, the results for the individually processed utterances of a given spoken word (i.e., the individual word models) are combined to produce a digital word prototype that electronically represents the given spoken word.
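The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions the abstract does not spell out: normalization is taken to mean subtracting each phoneme's mean similarity over the non-speech frames and clipping negative values to zero; segment boundaries are placed on whole frames, so the per-segment sums are only approximately equal in general; and word models are combined by an element-wise mean. All function names (`normalize`, `segment_bounds`, `word_model`, `prototype`) are hypothetical, and the phone model that produces the similarity values is assumed to exist upstream.

```python
from typing import List, Sequence

# frames[t][p] is the similarity of speech frame t to standard phoneme p,
# as produced by a phone model (not shown here).
Frames = List[List[float]]


def normalize(frames: Frames, non_speech: Sequence[int]) -> Frames:
    """Assumed normalization: subtract each phoneme's mean over the
    non-speech frames, then clip negative values to zero."""
    n_ph = len(frames[0])
    floor = [sum(frames[t][p] for t in non_speech) / len(non_speech)
             for p in range(n_ph)]
    return [[max(0.0, v - floor[p]) for p, v in enumerate(frame)]
            for frame in frames]


def segment_bounds(frames: Frames, n_segments: int) -> List[int]:
    """Return n_segments + 1 frame indices so that each segment holds
    roughly the same share of the total normalized similarity."""
    total = sum(sum(frame) for frame in frames)
    bounds, running, k = [0], 0.0, 1
    for t, frame in enumerate(frames):
        running += sum(frame)
        # Close segment k once the cumulative sum reaches k/n of the total.
        while k < n_segments and running >= k * total / n_segments:
            bounds.append(t + 1)
            k += 1
    bounds.append(len(frames))
    return bounds


def word_model(frames: Frames, bounds: List[int]) -> List[float]:
    """For each segment and each phoneme, sum the normalized similarity
    over the segment's frames; concatenate into one flat vector."""
    n_ph = len(frames[0])
    model = []
    for start, end in zip(bounds, bounds[1:]):
        for p in range(n_ph):
            model.append(sum(frames[t][p] for t in range(start, end)))
    return model


def prototype(models: List[List[float]]) -> List[float]:
    """Assumed combination rule: element-wise mean of the word models."""
    return [sum(vals) / len(vals) for vals in zip(*models)]


if __name__ == "__main__":
    # Toy utterance: 6 frames, 2 phonemes; frame 0 is treated as non-speech.
    utterance = [[1, 1], [3, 1], [1, 3], [1, 5], [5, 1], [1, 1]]
    norm = normalize(utterance, non_speech=[0])
    model = word_model(norm, segment_bounds(norm, n_segments=3))
    print(model)
    print(prototype([model, model]))
```

With S segments and P phonemes the word model has S × P entries regardless of how long the utterance is, which is the source of the compactness the abstract refers to.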
