
Speech recognition based on phonetic features and acoustic landmarks.


Abstract

A probabilistic and statistical framework is presented for automatic speech recognition based on a phonetic feature representation of speech sounds. In this acoustic-phonetic approach, the speech recognition problem is formulated as a maximization of the joint posterior probability of a set of phonetic features and the corresponding acoustic landmarks. Binary classifiers of the manner phonetic features (syllabic, sonorant and continuant) are applied for the probabilistic detection of speech landmarks. The landmarks include stop bursts, vowel onsets, syllabic peaks, syllabic dips, fricative onsets and offsets, and sonorant consonant onsets and offsets. The classifiers use automatically extracted knowledge-based acoustic parameters (APs) that are acoustic correlates of those phonetic features. For isolated word recognition with a known and limited vocabulary, the landmark sequences are constrained using a manner class pronunciation graph. Probabilistic decisions on place and voicing phonetic features are then made using a separate set of APs extracted using the landmarks.

The framework exploits two properties of the knowledge-based acoustic cues of phonetic features: (1) sufficiency of the acoustic cues of a phonetic feature for a decision on that feature and (2) invariance of the acoustic cues with respect to context. The probabilistic framework makes the acoustic-phonetic approach to speech recognition suitable for practical recognition tasks as well as compatible with probabilistic pronunciation and language models. Support vector machines (SVMs) are applied for the binary classification tasks because of two favorable properties: good generalization and the ability to learn from a relatively small amount of high-dimensional data. Performance comparable to Hidden Markov Model (HMM) based systems is obtained on landmark detection as well as isolated word recognition. Applications to rescoring of lattices from a large vocabulary continuous speech recognizer are also presented.
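The idea of constraining landmark sequences with a manner class pronunciation graph can be illustrated with a small sketch. This is not the thesis's implementation: the manner class names, the three-word lexicon, the per-frame posterior values, and the helpers `align` and `recognize` are all hypothetical, and the posteriors are hard-coded rather than produced by SVMs. Each word is reduced to a linear sequence of manner classes, frames are aligned to that sequence by dynamic programming, and the frames where the class changes stand in for landmarks.

```python
import math

NEG_INF = float("-inf")

def align(posteriors, manner_seq):
    """Best monotonic alignment of frames to a manner-class sequence.

    posteriors: list of dicts mapping manner class -> posterior per frame
                (assumed to come from some frame-level classifier).
    Returns (log_score, boundaries); boundaries are the frame indices at
    which the aligned manner class changes (stand-ins for landmarks).
    """
    T, S = len(posteriors), len(manner_seq)
    if T < S:
        return NEG_INF, []

    def log_post(t, s):
        # Floor the posterior so a missing class never yields log(0).
        return math.log(max(posteriors[t].get(manner_seq[s], 0.0), 1e-12))

    score = [[NEG_INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_post(0, 0)
    for t in range(1, T):
        for s in range(S):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else NEG_INF
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            if best > NEG_INF:
                score[t][s] = best + log_post(t, s)

    # Trace back from the final frame in the final class.
    boundaries, s = [], S - 1
    for t in range(T - 1, 0, -1):
        prev = back[t][s]
        if prev != s:
            boundaries.append(t)
        s = prev
    boundaries.reverse()
    return score[T - 1][S - 1], boundaries

def recognize(posteriors, lexicon):
    """Pick the word whose manner-class sequence aligns best."""
    best = (None, NEG_INF, [])
    for word, manner_seq in lexicon.items():
        log_score, boundaries = align(posteriors, manner_seq)
        if log_score > best[1]:
            best = (word, log_score, boundaries)
    return best

# Hypothetical lexicon: one manner-class sequence per word.
LEXICON = {
    "see": ["fricative", "vowel"],
    "tea": ["stop", "vowel"],
    "me": ["sonorant_consonant", "vowel"],
}

# Hypothetical frame-level posteriors (five frames).
FRAMES = [
    {"fricative": 0.80, "stop": 0.10, "sonorant_consonant": 0.05, "vowel": 0.05},
    {"fricative": 0.70, "stop": 0.10, "sonorant_consonant": 0.10, "vowel": 0.10},
    {"fricative": 0.10, "stop": 0.05, "sonorant_consonant": 0.15, "vowel": 0.70},
    {"fricative": 0.05, "stop": 0.05, "sonorant_consonant": 0.10, "vowel": 0.80},
    {"fricative": 0.05, "stop": 0.02, "sonorant_consonant": 0.03, "vowel": 0.90},
]

if __name__ == "__main__":
    word, log_score, boundaries = recognize(FRAMES, LEXICON)
    print(word, round(log_score, 3), boundaries)
```

With these made-up posteriors the sketch selects "see" and places the single class change at frame 2. A real pronunciation graph would generally allow branching and alternative pronunciations rather than one linear sequence per word.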

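Bibliographic record

  • Author: Juneja, Amit
  • Author affiliation: University of Maryland, College Park
  • Degree-granting institution: University of Maryland, College Park
  • Subject: Engineering Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2004
  • Pages: 169 p.
  • Original format: PDF
  • Language: English (eng)
  • Chinese Library Classification: Radio electronics, telecommunications technology
  • Date indexed: 2022-08-17 11:44:04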
