A speech recognition system and method which mimics transform parameters and estimates the mimicked transform parameters

Abstract

A speech recognition method comprises receiving a speech input in a first noise environment, the input comprising a sequence of observations, and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model. Using the acoustic model comprises providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of observations, wherein the model has been trained to recognise speech in a second noise environment and has a plurality of model parameters relating to the probability distribution of a word or part thereof being related to an observation, and adapting the model trained in the second environment to the first environment. The method further comprises determining the likelihood of a sequence of observations occurring in a given language using a language model, combining the likelihoods determined by the acoustic model and the language model, and outputting a sequence of words identified from the speech input signal. Adapting the model trained in the second environment to the first environment comprises three steps: adapting the model parameters of the model trained in the second noise environment to those of the first noise environment using transform parameters to produce a target distribution, wherein the transform parameters have a block diagonal form and are applied to regression classes, each regression class comprising a plurality of probability distributions; mimicking the target distribution using a linear regression type distribution which comprises mimicked transform parameters; and estimating the mimicked transform parameters. The invention aims to derive a speech recognition method that is computationally on a par with joint uncertainty decoding (JUD) methods but achieves accuracy similar to that of Vector Taylor Series (VTS) methods.
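
The idea of applying block-diagonal transform parameters to regression classes can be illustrated with a minimal Python sketch. It is not the patented estimation procedure: it only shows, under simplifying assumptions, how one affine transform per regression class might be shared by every Gaussian mean in that class. The function names (`block_diag`, `adapt_mean`, `adapt_models`), the mean-only update, and the data layout are hypothetical choices made for illustration; variance adaptation and the estimation of the mimicked transform parameters are omitted.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def block_diag(blocks: List[Matrix]) -> Matrix:
    """Assemble square blocks into one block-diagonal matrix."""
    dim = sum(len(b) for b in blocks)
    out = [[0.0] * dim for _ in range(dim)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


def adapt_mean(A: Matrix, b: Vector, mean: Vector) -> Vector:
    """Apply the affine transform mean' = A * mean + b."""
    return [sum(a * m for a, m in zip(row, mean)) + bias
            for row, bias in zip(A, b)]


def adapt_models(means: List[Vector],
                 class_of: List[int],
                 transforms: Dict[int, Tuple[Matrix, Vector]]) -> List[Vector]:
    """Adapt each Gaussian mean with the transform of its regression class.

    class_of[k] is the (hypothetical) regression class index of Gaussian k;
    transforms maps a class index to its shared (A, b) pair.
    """
    return [adapt_mean(*transforms[class_of[k]], mean)
            for k, mean in enumerate(means)]
```

Because the transform is block diagonal, features in one block (for example static coefficients) never mix with features in another (for example delta coefficients), which is one reason such transforms are cheaper to apply than full matrices.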

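Bibliographic details

Publication number: GB2471875A
Publication date: 2011-01-19
Original format: PDF
Applicant/patentee: TOSHIBA RESEARCH EUROPE LIMITED
Application number: GB20090012319
Inventors: MARK JOHN FRANCIS GALES; HAITAN XU
Filing date: 2009-07-15
Classification: G10L21/02; G10L15/20
Country: GB
Database entry time: 2022-08-21 17:45:09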