Code breaking for automatic speech recognition.

Abstract

Code Breaking is a divide-and-conquer approach to sequential pattern recognition tasks: we identify the weaknesses of an existing system and then use specialized decoders to strengthen the overall system. We study the technique in the context of Automatic Speech Recognition. Using the lattice cutting algorithm, we first analyze lattices generated by a state-of-the-art speech recognizer to spot possible errors in its first-pass hypothesis. We then train specialized decoders for each of these problems and apply them to refine the first-pass hypothesis.

We study the use of Support Vector Machines (SVMs) as discriminative models for each of these problems. Estimating a posterior distribution over the hypotheses in these regions of acoustic confusion is posed as a logistic regression problem. GiniSVMs, a variant of SVMs, can be used as an approximation technique to estimate the parameters of the logistic regression problem.

We first validate our approach on a small-vocabulary recognition task, namely alphadigits. We show that GiniSVMs can substantially improve the performance of a well-trained MMI-HMM system. We also find that reliable confidence scores can be derived for the GiniSVM hypotheses and that these can be used to good effect in hypothesis combination.

We then analyze lattice cutting in terms of its ability to reliably identify incorrectly hypothesized words, and to provide good alternatives for them, in the Czech MALACH domain, a large-vocabulary task. We describe a procedure for training and applying SVMs to strengthen the first-pass system, resulting in small but statistically significant recognition improvements. We conclude with a discussion of methods, including clustering, for obtaining further improvements on large-vocabulary tasks.
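
As an illustration of posing the posterior over competing hypotheses as a logistic regression problem, the following is a minimal sketch for a single binary confusion pair. It uses plain logistic regression fitted by gradient ascent, not the GiniSVM approximation studied in the dissertation. The feature vectors, the "B" versus "V" labels, and all function names are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient ascent on the log-likelihood."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = y - p  # gradient of the log-likelihood w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi + lr * g / n for wi, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


def posterior(w, b, x):
    """Return P(first word of the pair | x) under the fitted model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Made-up 2-D features for one confusion pair: label 1 = "B", label 0 = "V".
TRAIN_X = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 1.0], [0.3, 0.8]]
TRAIN_Y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(TRAIN_X, TRAIN_Y)
p_b = posterior(w, b, [0.85, 0.15])       # segment resembling the "B" examples
p_v_side = posterior(w, b, [0.15, 0.85])  # segment resembling the "V" examples
```

In a setup like this, one such binary model would be trained per confusion pair, and its posterior could then serve as a confidence score when deciding whether to keep or replace the first-pass word.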

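Bibliographic record

Author: Venkataramani, Veera
Author affiliation: The Johns Hopkins University
Degree-granting institution: The Johns Hopkins University
Subject: Engineering, Electronics and Electrical
Degree: Ph.D.
Year: 2005
Pages: 123 p.
Original format: PDF
Language of text: English
Chinese Library Classification: Radio electronics, telecommunications technology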
