SVMs for Automatic Speech Recognition: A Survey

Abstract

Hidden Markov Models (HMMs) are, undoubtedly, the most widely used core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. Despite some achievements, the predominance of Markov Models remains a fact today. During the last decade, however, a new tool appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is the one with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques to produce the fixed-dimension vectors needed for original SVMs. Afterwards we explore more sophisticated techniques based on the use of kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which revives an old speech recognition technique: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches to tackle more complex tasks like connected digit recognition or continuous speech recognition using SVMs. Finally we draw some conclusions and outline several ongoing lines of research.
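
The abstract mentions kernels that compare sequences of different lengths by means of a DTW-style alignment. The following is a minimal sketch of that idea, not the chapter's own formulation: the function names `rbf` and `alignment_kernel`, the `gamma` parameter, the step weights (1, 2, 1), and the normalization by the summed sequence lengths are all assumptions chosen for illustration.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    """Frame-level RBF kernel between two feature vectors (assumed choice)."""
    d = x - y
    return float(np.exp(-gamma * np.dot(d, d)))

def alignment_kernel(X, Y, gamma=0.5):
    """DTW-style alignment score between sequences X (n, d) and Y (m, d).

    Dynamic programming keeps the best accumulated frame similarity over
    all monotonic alignments; diagonal steps are weighted 2 and
    horizontal/vertical steps 1 (hypothetical weights), so every path has
    total weight n + m, which is used as the normalizer.
    """
    n, m = len(X), len(Y)
    G = np.full((n + 1, m + 1), -np.inf)
    G[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = rbf(X[i - 1], Y[j - 1], gamma)
            G[i, j] = max(G[i - 1, j] + k,          # advance in X only
                          G[i - 1, j - 1] + 2 * k,  # advance in both
                          G[i, j - 1] + k)          # advance in Y only
    return G[n, m] / (n + m)

# Two toy "utterances" of different lengths with 1-dimensional frames.
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([[0.0], [0.0], [1.0], [2.0]])
score = alignment_kernel(X, Y)
```

A Gram matrix of such scores could be passed to an SVM implementation that accepts precomputed kernels. Note that alignment-based scores of this kind are not guaranteed to be positive semidefinite, so this should be read as an illustration of the alignment step only.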

Bibliographic record

Source
Progress in Nonlinear Speech Processing; Lecture Notes in Computer Science; 4391 | 2005 | pp. 190-216 | 27 pages
Conference location: Crete (GR)
Authors
R. Solera-Urena; J. Padrell-Sendra; D. Martin-Iglesias; A. Gallardo-Antolin; C. Pelaez-Moreno; F. Diaz-de-Maria
Author affiliation

Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911-Leganes (Madrid), Spain

Original format: PDF
Language: English (eng)
Chinese Library Classification: Computing technology, computer technology

Similar literature

1. Structured SVMs for Automatic Speech Recognition [J]. Zhang S.-X., Gales M. J. F. Audio, Speech, and Language Processing, IEEE Transactions on. 2013, No. 3

2. Automatic speech patterns recognition of commands using SVM and PSO [J]. Batista Gracieth Cavalcanti, Silva Washington Luis Santos, de Oliveira Duarte Lopes. Multimedia Tools and Applications. 2019, No. 22

3. Design of a real time automatic speech recognition system using Modified One Against All SVM classifier [J]. J. Manikandan, B. Venkataramani. Microprocessors and Microsystems. 2011, No. 6

4. Automatic Speech Emotion Recognition Using Auditory Models with Binary Decision Tree and SVM [C]. Yuncu Enes, Hacihabiboglu Huseyin, Bozsahin Cem. International Conference on Pattern Recognition. 2014

5. A multimodal fusion approach for automatic postal address recognition system using Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) techniques [D]. Singh, Amriteshwar. 2011

6. Automatic speech recognition in the operating room – An essential contemporary tool or a redundant gadget? A survey evaluation among physicians in form of a qualitative study [O]. Antonia Schulte, Rodrigo Suarez-Ibarrola, Daniel Wegen. 2020

7. SVMs for Automatic Speech Recognition: a Survey [O]. Solera Ureña R., Padrell Sendra J., Martín Iglesias D. 2007
