Automatic determination of sub-word units for automatic speech recognition



Abstract

Current automatic speech recognition (ASR) research is focused on recognition of continuous, spontaneous speech. Spontaneous speech contains a lot of variability in the way words are pronounced, and canonical pronunciations of each word are not true to the variation that is seen in real data.

Two of the components of an ASR system are acoustic models and pronunciation models. The variation within spontaneous speech must be accounted for by these components. Phones, or context-dependent phones, are typically used as the base sub-word unit, and one acoustic model is trained for each sub-word unit. Pronunciation modelling largely takes place in a dictionary, which relates words to sequences of phones. Acoustic modelling and pronunciation modelling overlap, and the two are not clearly separable in modelling pronunciation variation. Techniques that find pronunciation variants in the data and then reflect these in the dictionary have not provided expected gains in recognition.

An alternative approach to modelling pronunciations in terms of phones is to derive units automatically: using data-driven methods to determine an inventory of sub-word units, their acoustic models, and their relationship to words. This thesis presents a method for the automatic derivation of a sub-word unit inventory, whose main components are

1. automatic and simultaneous generation of a sub-word unit inventory and acoustic model set, using an ergodic hidden Markov model whose complexity is controlled using the Bayesian Information Criterion
2. automatic generation of probabilistic dictionaries using joint multigrams

The prerequisites of this approach are fewer than in previous work on unit derivation; notably, the timings of word boundaries are not required here. The approach is language independent since it is entirely data-driven and no linguistic information is required. The dictionary generation method outperforms a supervised method using phonetic data. The automatically derived units and dictionary perform reasonably on a small spontaneous speech task, although not yet outperforming phones.
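
The abstract says the complexity of the ergodic hidden Markov model is controlled with the Bayesian Information Criterion, but it does not give the formula or the procedure used in the thesis. As a rough illustration only, the sketch below applies the textbook definition, BIC = -2 ln L + k ln n, to choose among candidate models of different sizes. The function names, the candidate labels, and every number in `CANDIDATES` are hypothetical and are not results from the thesis.

```python
import math


def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Textbook BIC: -2 * ln(L) + k * ln(n). Lower values are preferred."""
    return -2.0 * log_likelihood + num_params * math.log(num_samples)


def select_model(candidates, num_samples):
    """Return the (name, log_likelihood, num_params) tuple with the lowest BIC."""
    return min(candidates, key=lambda c: bic(c[1], c[2], num_samples))


# Hypothetical candidates: (label, training log-likelihood, free parameters).
# These values are invented for illustration and do not come from the thesis.
CANDIDATES = [
    ("8 states", -52000.0, 200),
    ("16 states", -50500.0, 500),
    ("32 states", -50000.0, 1400),
]

# Hypothetical number of training observations (e.g. acoustic frames).
NUM_SAMPLES = 10_000

best = select_model(CANDIDATES, NUM_SAMPLES)
print(best[0])
```

With these made-up figures the largest model has the highest likelihood, but its parameter penalty outweighs that gain, so a mid-sized model is selected. That trade-off is the general idea behind using BIC to limit model growth; how the thesis applies it to the unit inventory is not stated in the abstract.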

Bibliographic record

  • Author

    Couper Kenney Fiona;

  • Author affiliation
  • Year: 2008
  • Total pages
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
