Wavelet-based energy binning cepstal features for automatic speech recognition

Abstract

Systems and methods are provided for processing acoustic speech signals using the wavelet transform (or, alternatively, the Fourier transform) as a fundamental tool. The method essentially involves “synchrosqueezing” spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm, and the amplitude, frequency and bandwidth of each component are thereby extracted. The cepstrum generated from this information is referred to as “K-mean Wastrum.” In another aspect, the result of the K-means clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as “formant-based wastrum.” Formants are interpolated in unvoiced regions, and the contribution of the unvoiced turbulent part of the spectrum is added. This method requires adequate formant tracking. The resulting robust formant extraction has a number of applications in speech processing and analysis, including vocal tract normalization.
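
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a single analysis frame whose spectral components are already available as (frequency, magnitude) pairs, runs a plain energy-weighted one-dimensional K-means over the frequencies, and reports a centre frequency, amplitude and bandwidth per cluster. The function names (weighted_kmeans_1d, describe_components), the even-spacing initialisation, and the use of the weighted standard deviation as "bandwidth" are all assumptions made for illustration.

```python
import math


def weighted_kmeans_1d(freqs, weights, k, max_iter=100):
    """Cluster 1-D frequencies into k groups, weighting each point by `weights`."""
    lo, hi = min(freqs), max(freqs)
    # Deterministic start (assumption): centres spread evenly over the observed range.
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each component joins its nearest centre.
        new_labels = [min(range(k), key=lambda c: abs(f - centers[c])) for f in freqs]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: move each centre to the weighted mean of its members.
        for c in range(k):
            total = sum(w for w, l in zip(weights, labels) if l == c)
            if total > 0:
                centers[c] = sum(
                    f * w for f, w, l in zip(freqs, weights, labels) if l == c
                ) / total
    return centers, labels


def describe_components(freqs, mags, k):
    """Return frequency, amplitude and bandwidth for each non-empty cluster."""
    energies = [m * m for m in mags]  # weight points by energy (assumption)
    centers, labels = weighted_kmeans_1d(freqs, energies, k)
    components = []
    for c in range(k):
        members = [(f, e) for f, e, l in zip(freqs, energies, labels) if l == c]
        total = sum(e for _, e in members)
        if total == 0:
            continue  # skip clusters that captured no energy
        variance = sum(e * (f - centers[c]) ** 2 for f, e in members) / total
        components.append({
            "frequency": centers[c],          # energy-weighted mean, Hz
            "amplitude": math.sqrt(total),    # root of the summed energy
            "bandwidth": math.sqrt(variance), # weighted standard deviation, Hz
        })
    return sorted(components, key=lambda comp: comp["frequency"])


# Hypothetical frame: two groups of components near 500 Hz and 1500 Hz.
frame_freqs = [480.0, 500.0, 520.0, 1450.0, 1500.0, 1550.0]
frame_mags = [1.0, 2.0, 1.0, 0.5, 1.0, 0.5]
for comp in describe_components(frame_freqs, frame_mags, k=2):
    print(comp)
```

Tracking components across time, as the abstract describes, would additionally require linking clusters between consecutive frames, for example by seeding each frame's centres from the previous frame's result; that step is left out of this sketch.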

Bibliographic data

Publication number: US6253175B1

Patent type:
Publication date: 2001-06-26

Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Application number: US19980201055
Inventors: STEPHANE H. MAES; SANKAR BASU

Filing date: 1998-11-30
Classification: G10L150/00
Country: US
Date indexed: 2022-08-22 01:03:59
