APPARATUS AND METHOD FOR EXTRACTING NOISE-ROBUST THE SPEECH RECOGNITION VECTOR SHARING THE PREPROCESSING STEP USED IN SPEECH CODING

Abstract

An apparatus and a method are provided for extracting noise-robust speech feature vectors in a distributed speech recognition terminal by sharing a preprocessing step of the speech coder. Because speech communication and speech recognition use the same preprocessing step, speech recognition performance improves while memory consumption and computational load stay low on a low-spec terminal. A channel SNR (signal-to-noise ratio) estimation module (24) estimates the channel SNR of a speech signal from the channel energy estimate calculated by a channel energy estimation module (23) and the background noise energy estimate calculated by a background noise estimation module (30). A voice metric calculation module (25) calculates the sum of the voice metrics over the channels of the speech signal from the channel SNR estimated by the channel SNR estimation module. A spectral deviation estimation module (26) estimates the spectral deviation of the speech signal from the channel energy estimate calculated by the channel energy estimation module. A noise update decision module (27) issues a command to update the noise estimate based on the differences among the channel energy estimate, the estimate of the current power spectrum, and the estimate of the long-term average power spectrum. A channel SNR modifier (28) modifies the channel SNR estimated by the channel SNR estimation module based on the sum of the voice metrics. A channel gain computation module (29) computes a linear channel gain from the modified channel SNR and the background noise energy estimate. A frequency domain filter (31) applies the linear channel gain to the spectrum signal produced by a frequency domain converter.
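The data flow between modules (24) to (31) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract names the modules and their inputs but gives no formulas, so every function name, formula, threshold, and constant below is hypothetical and is not taken from the patent.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def estimate_channel_snr_db(channel_energy, noise_energy):
    """Module (24): per-channel SNR in dB from the two energy estimates."""
    return [10.0 * math.log10(max(e, EPS) / max(n, EPS))
            for e, n in zip(channel_energy, noise_energy)]


def sum_voice_metrics(snr_db, max_metric=30.0):
    """Module (25): assumed metric is the SNR clamped to [0, max_metric]."""
    return sum(min(max(s, 0.0), max_metric) for s in snr_db)


def spectral_deviation(channel_energy, longterm_energy_db):
    """Module (26): assumed as summed |current dB - long-term average dB|."""
    return sum(abs(10.0 * math.log10(max(e, EPS)) - m)
               for e, m in zip(channel_energy, longterm_energy_db))


def should_update_noise(metric_sum, deviation,
                        metric_threshold=45.0, deviation_threshold=28.0):
    """Module (27): assumed rule, update when the frame looks like steady noise."""
    return metric_sum < metric_threshold and deviation < deviation_threshold


def modify_channel_snr(snr_db, metric_sum,
                       metric_threshold=45.0, snr_floor_db=1.0):
    """Module (28): assumed rule, flatten the SNR when the metric sum is low."""
    if metric_sum < metric_threshold:
        return [snr_floor_db] * len(snr_db)
    return [max(s, snr_floor_db) for s in snr_db]


def compute_linear_gains(mod_snr_db, noise_energy,
                         slope=0.4, snr_ref_db=6.0, min_gain_db=-13.0):
    """Module (29): assumed dB-domain gain rule, converted to linear gain."""
    total_noise = max(sum(noise_energy), EPS)
    # Overall attenuation grows with the noise level, limited to [min_gain_db, 0].
    overall_db = min(max(min_gain_db, -10.0 * math.log10(total_noise)), 0.0)
    gains = []
    for s in mod_snr_db:
        gain_db = slope * (s - snr_ref_db) + overall_db
        gain_db = min(max(gain_db, min_gain_db), 0.0)
        gains.append(10.0 ** (gain_db / 20.0))
    return gains


def apply_gains(spectrum, gains, bins_per_channel):
    """Module (31): scale each frequency bin by the gain of its channel."""
    return [x * gains[min(i // bins_per_channel, len(gains) - 1)]
            for i, x in enumerate(spectrum)]


# Example frame: two channels with strong speech energy, one near the noise floor.
channel_energy = [1000.0, 1000.0, 2.0]
noise_energy = [1.0, 1.0, 1.0]
snr = estimate_channel_snr_db(channel_energy, noise_energy)
metric_sum = sum_voice_metrics(snr)
gains = compute_linear_gains(modify_channel_snr(snr, metric_sum), noise_energy)
filtered = apply_gains([1 + 1j, 2.0, 3.0, 4.0, 5.0, 6.0], gains, bins_per_channel=2)
```

In this sketch the high-SNR channels pass through unchanged while the low-SNR channel is attenuated. The filtered spectrum is what a shared front end would hand to both the speech coder and the recognition feature extractor.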

Bibliographic data

Publication number: KR20080002359A

Publication date: 2008-01-04

Original format: PDF
Applicant/patentee: KT CORPORATION

Application number: KR20060061150
Inventors: OH YOU LEE; KIM JAE IN; RYU CHANG SUN; YOON JAE SAM; KIM HONG KOOK

Filing date: 2006-06-30
Classification: G10L15/20; G10L15/28
Country: KR
Database entry time: 2022-08-21 19:54:22