
METHOD AND SYSTEM FOR SPEECH RECONSTRUCTION FROM SPEECH RECOGNITION FEATURES



Abstract

A speech reconstruction method for converting a series of feature vectors and a series of respective pitch values and voicing decisions of an original input speech signal into a speech signal, the feature vectors being obtained as follows:

(i) deriving at successive instances of time an estimate of a spectral envelope SE(i), i being a frequency index, of the digitized original speech signal,
(ii) multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions, BW(i, k), i being a frequency index and k being the window function index, wherein each window is non-zero over a narrow range of frequencies, and computing the integrals thereof, according to the expression BI(k) = Σ_i SE(i) · BW(i, k), where BI(k) is defined as the kth component or "bin" of a "binned spectrum", and
(iii) assigning said integrals or a set of predetermined functions thereof to respective components of a corresponding feature vector in a series of feature vectors;

said speech reconstruction method comprising:

(a) converting each feature vector into a binned spectrum,
(b) generating harmonic frequencies and weights according to the corresponding pitch and voicing decision,
(c) generating for each harmonic frequency a respective phase, depending on the corresponding pitch value and voicing decision and possibly on the binned spectrum,
(d) sampling a predetermined set of basis functions, each being a function in a set of frequency domain functions with bounded supports, at all harmonic frequencies which are within its support, and multiplying by the respective harmonic weight, so as to produce for each sampled basis function a respective line spectrum having multiple components,
(e) combining each component of each respective line spectrum with the respective phase thereof so as to produce a complex line spectrum for each basis function,
(f) generating gain coefficients of the basis functions,
(g) multiplying the complex line spectrum of each basis function by the respective basis function gain coefficient, and summing up all resulting complex line spectra to generate a single complex line spectrum having a respective component for each of the harmonic frequencies, and
(h) generating a time signal from complex line spectra computed at successive instances of time.
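
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the window shapes, harmonic weights, phase model, or gain computation, so everything concrete here is an assumption made for illustration. Specifically, the sketch assumes triangular windows on a linear frequency axis (reused as the basis functions), unit harmonic weights, zero phase for voiced frames and random phase for unvoiced frames, a fixed 100 Hz line spacing for unvoiced frames, gains taken directly from the binned spectrum, and a single-frame sum of sinusoids for step (h). All function and parameter names are hypothetical.

```python
import cmath
import math
import random


def basis(freq_hz, k, n_bins, nyquist_hz):
    """Hypothetical k-th window/basis function: a triangle with bounded support."""
    spacing = nyquist_hz / (n_bins + 1)
    center = (k + 1) * spacing
    return max(0.0, 1.0 - abs(freq_hz - center) / spacing)


def binned_spectrum(envelope, n_bins, nyquist_hz):
    """Step (ii): BI(k) = sum over i of SE(i) * BW(i, k)."""
    n = len(envelope)
    freqs = [i * nyquist_hz / (n - 1) for i in range(n)]
    return [
        sum(se * basis(f, k, n_bins, nyquist_hz) for se, f in zip(envelope, freqs))
        for k in range(n_bins)
    ]


def harmonics(pitch_hz, voiced, nyquist_hz, unvoiced_spacing_hz=100.0):
    """Step (b): harmonic frequencies and weights (unit weights are an assumption)."""
    f0 = pitch_hz if voiced else unvoiced_spacing_hz
    count = int(nyquist_hz // f0)
    freqs = [f0 * (h + 1) for h in range(count)]
    return freqs, [1.0] * count


def phases(freqs, voiced, rng):
    """Step (c): zero phase if voiced, random phase if unvoiced (an assumption)."""
    if voiced:
        return [0.0] * len(freqs)
    return [rng.uniform(-math.pi, math.pi) for _ in freqs]


def complex_line_spectrum(bins, pitch_hz, voiced, nyquist_hz, rng):
    """Steps (b) to (g) for one frame."""
    freqs, weights = harmonics(pitch_hz, voiced, nyquist_hz)
    phis = phases(freqs, voiced, rng)
    n_bins = len(bins)
    gains = bins  # step (f), assumption: use BI(k) directly as the gains
    spectrum = []
    for f, w, phi in zip(freqs, weights, phis):
        # (d) sample every basis function at this harmonic and apply its weight,
        # (g) scale by the gain and sum over basis functions,
        # (e) attach the harmonic's phase to obtain a complex component.
        amp = sum(g * basis(f, k, n_bins, nyquist_hz) * w for k, g in enumerate(gains))
        spectrum.append(cmath.rect(amp, phi))
    return freqs, spectrum


def synthesize_frame(freqs, spectrum, sample_rate_hz, n_samples):
    """Step (h), simplified: sum of sinusoids for a single frame."""
    return [
        sum(
            abs(c) * math.cos(2 * math.pi * f * n / sample_rate_hz + cmath.phase(c))
            for f, c in zip(freqs, spectrum)
        )
        for n in range(n_samples)
    ]


# Usage: a flat envelope, one voiced frame at 200 Hz, 8 kHz sampling.
envelope = [1.0] * 65
bins = binned_spectrum(envelope, n_bins=8, nyquist_hz=4000.0)
freqs, spec = complex_line_spectrum(bins, 200.0, True, 4000.0, random.Random(0))
frame = synthesize_frame(freqs, spec, 8000.0, 160)
```

A fuller reconstruction would combine frames computed at successive instances of time (for example by overlap-add) rather than synthesizing one frame in isolation.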


Bibliographic data

Publication number: IL135192B

Patent type:
Publication date: 2004-06-20

Original format: PDF
Applicant/Patentee: DAN CHAZAN; GILAD COHEN; RON HOORY; NUANCE COMMUNICATIONS INC.; INTERNATIONAL BUSINESS MACHINES CORPORATION

Application number: IL135192
Inventors: DAN CHAZAN; RON HOORY; GILAD COHEN

Filing date: 2000-03-21
Classification: G10L15/26
Country: IL
Date added to database: 2022-08-21 23:10:43
