Adaptive Boosting for Automatic Speech Recognition.

Abstract

The atomic units of most automatic speech recognition (ASR) systems are phonemes. However, the most widely used features in ASR, perceptual linear prediction (PLP) and mel-frequency cepstral coefficients (MFCC), do not carry phoneme information explicitly. Discriminative features that carry phoneme information have been shown to yield better ASR accuracy. Generating such discriminative features relies on training classifiers that transform the original features into new probabilistic features.

One of the most commonly used techniques for modeling probabilities of continuous distributions is the Gaussian mixture model (GMM). In this work, a GMM-based classifier is used to convert each acoustic feature vector into a vector of posterior probabilities over all classes. Furthermore, an adaptive boosting (AdaBoost) algorithm is applied to combine the classifiers and enhance performance.

Training GMM-based AdaBoost classifiers is computationally very expensive. To make it feasible for very large vocabulary speech recognition systems with thousands of hours of training data, we implemented a hierarchical AdaBoost that splits the whole training into multiple parallel processes. The speed-up reduced the training time from more than 100 days to within a week.

The AdaBoost features were then successfully combined with spectral features for ASR. Compared to the baseline using the standard features, the AdaBoost system reduced the word error rate (WER) by 2%. Moreover, the AdaBoost system also contributed consistent gains in system combination, even against a very strong baseline.
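The two building blocks named in the abstract, a GMM-based classifier that maps each feature vector to class posteriors and a boosting step that reweights training frames and combines classifiers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which AdaBoost variant, covariance structure, or combination rule the thesis uses, so the sketch assumes diagonal-covariance GMMs, Bayes' rule with given class priors, and the multi-class SAMME weight update. The function names are hypothetical.

```python
import numpy as np

def log_gmm_likelihood(x, weights, means, variances):
    """Log-likelihood of frames x (N, D) under one diagonal-covariance GMM.

    weights: (M,), means: (M, D), variances: (M, D). Shapes are assumptions.
    """
    diff = x[:, None, :] - means[None, :, :]                  # (N, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)                 # (N, M)
    peak = log_weighted.max(axis=1, keepdims=True)            # log-sum-exp trick
    return (peak + np.log(np.exp(log_weighted - peak)
                          .sum(axis=1, keepdims=True)))[:, 0]

def class_posteriors(x, class_gmms, priors):
    """P(class | frame) by Bayes' rule; one GMM per class, rows sum to 1."""
    log_joint = np.stack(
        [log_gmm_likelihood(x, *gmm) for gmm in class_gmms], axis=1
    ) + np.log(priors)
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def samme_round(pred, labels, sample_w, n_classes):
    """One multi-class AdaBoost (SAMME) update; assumed variant, not the thesis's."""
    miss = pred != labels
    err = np.clip(np.sum(sample_w * miss) / np.sum(sample_w), 1e-10, 1 - 1e-10)
    alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
    new_w = sample_w * np.exp(alpha * miss)    # misclassified frames gain weight
    return alpha, new_w / new_w.sum()

def combine_posteriors(posteriors, alphas):
    """Alpha-weighted average of per-round posteriors (assumes positive alphas)."""
    mix = sum(a * p for a, p in zip(alphas, posteriors))
    return mix / mix.sum(axis=1, keepdims=True)

# Toy usage: two classes, each modeled by a single-component GMM in 2-D.
gmm_a = (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]]))
gmm_b = (np.array([1.0]), np.array([[3.0, 3.0]]), np.array([[1.0, 1.0]]))
frames = np.array([[0.0, 0.0], [3.0, 3.0], [1.5, 1.5]])
post = class_posteriors(frames, [gmm_a, gmm_b], np.array([0.5, 0.5]))
alpha, w = samme_round(post.argmax(axis=1), np.array([0, 1, 1]),
                       np.full(3, 1 / 3), n_classes=2)
```

In a boosted setup each round would retrain the class GMMs on the reweighted frames and `combine_posteriors` would merge the per-round outputs into one feature vector per frame; how the thesis parallelizes this hierarchically is not described in the abstract and is not modeled here.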

Bibliographic record

Author: Nguyen, Kham
Author affiliation: Northeastern University
Degree-granting institution: Northeastern University
Subject: Electrical engineering; Computer engineering
Degree: Ph.D.
Year: 2016
Pages: 156 p.
Total pages: 156
Original format: PDF
Language: English
