
Code breaking for automatic speech recognition.



Abstract

Code Breaking is a divide-and-conquer approach for sequential pattern recognition tasks in which we identify weaknesses of an existing system and then use specialized decoders to strengthen the overall system. We study the technique in the context of Automatic Speech Recognition. Using the lattice cutting algorithm, we first analyze lattices generated by a state-of-the-art speech recognizer to spot possible errors in its first-pass hypothesis. We then train specialized decoders for each of these problems and apply them to refine the first-pass hypothesis.

We study the use of Support Vector Machines (SVMs) as discriminative models over each of these problems. The estimation of a posterior distribution over hypotheses in these regions of acoustic confusion is posed as a logistic regression problem. GiniSVMs, a variant of SVMs, can be used as an approximation technique to estimate the parameters of the logistic regression problem.

We first validate our approach on a small vocabulary recognition task, namely alphadigits. We show that the use of GiniSVMs can substantially improve the performance of a well-trained MMI-HMM system. We also find that it is possible to derive reliable confidence scores over the GiniSVM hypotheses and that these can be used to good effect in hypothesis combination.

We then analyze lattice cutting in terms of its ability to reliably identify incorrectly hypothesized words, and to provide good alternatives for them, in the Czech MALACH domain, a large vocabulary task. We describe a procedure to train and apply SVMs to strengthen the first-pass system, resulting in small but statistically significant recognition improvements. We conclude with a discussion of methods, including clustering, for obtaining further improvements on large vocabulary tasks.
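The refinement step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the thesis: the confusable region is supplied by hand rather than found by lattice cutting, each region is represented by a hypothetical fixed-length feature vector, and the posterior is fitted with plain logistic regression by gradient descent rather than with a GiniSVM. All function names, the confusion pair ("B", "V"), and the data are illustrative only.

```python
import math

def posterior(w, b, x):
    """Logistic posterior P(second word of the pair | features x)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss.

    Stand-in for the GiniSVM estimator named in the abstract (assumption).
    """
    dim = len(features[0])
    w, b, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = posterior(w, b, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def refine(first_pass, segments, decoders):
    """Rescore each confusable position with its specialized decoder.

    segments: position -> ((word_a, word_b), feature vector)
    decoders: (word_a, word_b) -> (w, b), where label 1 means word_b
    """
    refined = list(first_pass)
    for pos, (pair, x) in segments.items():
        w, b = decoders[pair]
        refined[pos] = pair[1] if posterior(w, b, x) >= 0.5 else pair[0]
    return refined

# Toy training data for one hypothetical confusion pair: label 0 = "B", 1 = "V".
train_x = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
           [0.9, 0.8], [0.8, 1.0], [1.0, 0.7]]
train_y = [0, 0, 0, 1, 1, 1]
decoders = {("B", "V"): train_logistic(train_x, train_y)}

# First-pass hypothesis with one flagged region at position 1.
first_pass = ["A", "B", "3"]
segments = {1: (("B", "V"), [0.95, 0.9])}
refined = refine(first_pass, segments, decoders)
print(refined)
```

Only the flagged position is rescored; every other word in the first-pass hypothesis is left unchanged, which mirrors the divide-and-conquer idea of applying a specialized decoder only where the baseline system is likely to be wrong.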

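Bibliographic record

  • Author: Venkataramani, Veera.
  • Author affiliation: The Johns Hopkins University
  • Degree-granting institution: The Johns Hopkins University
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 123
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Radio electronics; telecommunications technology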