A study on the use of conditional random fields for automatic speech recognition.

Abstract

Current state-of-the-art systems for Automatic Speech Recognition (ASR) use statistical modeling techniques such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to recognize spoken language. These techniques make use of statistics derived from the acoustic frequencies of the speech signal. In recent years, interest has been rising in the use of phonological features derived from these acoustic frequency features in addition to, or in place of, the acoustic frequency features themselves. These phonological features are derived from the manner in which speech is physically produced in the vocal tract of the speaker, rather than from models of how speech is heard by the listener.

Integrating phonological features into ASR models presents new challenges. The mathematical assumptions made to build current models may work well for features derived from acoustic frequencies, but do not necessarily fit phonological features as well. How to alter the mathematical models to allow for this new type of input feature is an ongoing area of ASR research. This dissertation examines the use of the statistical model known as a Conditional Random Field (CRF) for ASR using phonological features. CRFs are statistical models of sequences that are similar to HMMs, but CRF models do not make any assumptions about the independence or interdependence of the data being modeled.

This dissertation provides (1) a CRF-based pilot system that achieves superior performance on a phonetic recognition task compared with a comparably configured HMM model, and does so with many fewer parameters, (2) an extension of this model to create new features for an HMM-based system for word recognition, and (3) a fully developed system for word recognition using CRFs.

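Bibliographic record

  • Author

    Morris, Jeremy J.

  • Author affiliation

    The Ohio State University

  • Degree-granting institution: The Ohio State University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pagination: 145 p.
  • Total pages: 145
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification:
  • Keywords: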
