Estimating speaker-specific affine transforms for neural network based speech recognition systems
Abstract
Features are disclosed for estimating affine transforms in log filter-bank energy (“LFBE”) space in order to adapt artificial neural network-based acoustic models to a new speaker or environment. Neural network-based acoustic models may be trained using concatenated LFBEs as input features. The affine transform may be estimated by minimizing the least squares error between the corresponding linear and bias transform parts for the resultant neural network feature vector and for a standard speaker-specific feature vector obtained for a GMM-based acoustic model using constrained Maximum Likelihood Linear Regression (“cMLLR”) techniques. Alternatively, the affine transform may be estimated by minimizing the least squares error directly between the resultant transformed neural network feature and a standard speaker-specific feature obtained for a GMM-based acoustic model.
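The second alternative, fitting the transform directly to target features, can be illustrated with an ordinary least squares fit. The sketch below is not the patented procedure. It assumes that each row of `X` is an LFBE-space feature vector and that each row of `Y` is the corresponding target feature (for example, a speaker-specific feature derived with cMLLR), already expressed in the same dimensionality. The function names `estimate_affine` and `apply_affine` are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np


def estimate_affine(X, Y):
    """Fit A and b minimizing sum_t ||A @ x_t + b - y_t||^2.

    X: (T, d_in) array of input feature vectors (assumed LFBE frames).
    Y: (T, d_out) array of target feature vectors (assumed to be given).
    Returns A with shape (d_out, d_in) and b with shape (d_out,).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Append a constant column so the bias is estimated jointly with A.
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    # Solve the least squares problem X_aug @ W ~= Y.
    W, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)
    A = W[:-1].T   # linear part
    b = W[-1]      # bias part
    return A, b


def apply_affine(X, A, b):
    """Apply the estimated transform to every row of X."""
    return np.asarray(X, dtype=float) @ A.T + b
```

Estimating the bias through an appended constant column keeps the linear and bias parts in a single solve. With few adaptation frames relative to the feature dimension the problem becomes underdetermined, in which case `np.linalg.lstsq` returns the minimum-norm solution. A practical system would likely need some form of regularization, which this sketch omits.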