Adaptive Boosting for Automatic Speech Recognition.

Abstract

The atomic units of most automatic speech recognition (ASR) systems are phonemes. However, the most widely used features in ASR, perceptual linear prediction (PLP) and mel-frequency cepstral coefficients (MFCC), do not carry phoneme information explicitly. Discriminative features that carry phoneme information have been shown to yield better ASR accuracy. Generating such features relies on training classifiers that transform the original features into new probabilistic features.

One of the most commonly used techniques for modeling probabilities in continuous distributions is the Gaussian mixture model (GMM). In this work, a GMM-based classifier converts each acoustic feature vector into a vector of posterior probabilities over all classes. Furthermore, an adaptive boosting (AdaBoost) algorithm is applied to combine the classifiers and improve performance.

Training GMM-based AdaBoost classifiers is computationally very expensive. To make it feasible for very large vocabulary speech recognition systems with thousands of hours of training data, we implemented a hierarchical AdaBoost that splits the whole training into multiple parallel processes. The speed-up reduced the training time from more than about 100 days to within a week.

The AdaBoost features were then successfully combined with spectral features for ASR. Compared with a baseline using the standard features, the AdaBoost system reduced the word error rate (WER) by 2%. Moreover, the AdaBoost system contributed consistent gains in system combination, even against a very strong baseline.
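The posterior-feature idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation: the diagonal-covariance GMMs, the SAMME-style classifier weight, the weighted averaging of posteriors, and all function names (`log_gmm_likelihood`, `posterior_vector`, `samme_alpha`, `boosted_posterior`) are assumptions chosen for illustration only.

```python
import numpy as np

def log_gmm_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    x: (D,), weights: (K,), means: (K, D), variances: (K, D).
    """
    diff = x - means
    # Log-density of x under each Gaussian component.
    log_comp = -0.5 * (np.sum(np.log(2.0 * np.pi * variances), axis=1)
                       + np.sum(diff * diff / variances, axis=1))
    a = np.log(weights) + log_comp
    m = np.max(a)
    # Log-sum-exp over components, shifted by the max for numerical stability.
    return m + np.log(np.sum(np.exp(a - m)))

def posterior_vector(x, class_gmms, log_priors):
    """Posterior P(class | x) for every class, via Bayes' rule.

    class_gmms: one (weights, means, variances) tuple per class (assumed layout).
    """
    log_joint = np.array([log_gmm_likelihood(x, *g) for g in class_gmms])
    log_joint = log_joint + np.asarray(log_priors)
    log_joint = log_joint - np.max(log_joint)
    post = np.exp(log_joint)
    return post / post.sum()

def samme_alpha(weighted_error, n_classes):
    """Classifier weight in the SAMME multi-class AdaBoost variant (assumed here)."""
    eps = np.clip(weighted_error, 1e-10, 1.0 - 1e-10)
    return np.log((1.0 - eps) / eps) + np.log(n_classes - 1)

def boosted_posterior(posteriors, alphas):
    """Weighted average of per-round posterior vectors (assumes positive alphas)."""
    posteriors = np.asarray(posteriors, dtype=float)  # shape (T, C)
    alphas = np.asarray(alphas, dtype=float)          # shape (T,)
    combined = alphas @ posteriors
    return combined / combined.sum()
```

In a sketch like this, each frame's posterior vector would replace or augment the PLP/MFCC vector as the input feature. How the dissertation actually weights and combines its boosted classifiers is not stated in the abstract.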

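Bibliographic record

  • Author

    Nguyen, Kham

  • Author affiliation

    Northeastern University

  • Degree-granting institution Northeastern University
  • Subject Electrical engineering; Computer engineering
  • Degree Ph.D.
  • Year 2016
  • Pagination 156 p.
  • Total pages 156
  • Original format PDF
  • Language of text English (eng)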