
SVMs for Automatic Speech Recognition: A Survey


Abstract

Hidden Markov Models (HMMs) are, undoubtedly, the most widely used core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. Despite some achievements, however, Markov models remain dominant today. During the last decade, a new tool has appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is the one with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques for producing the fixed-dimension vectors needed by the original SVMs. Afterwards we explore more sophisticated techniques based on kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which revives an old speech recognition technique: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches that use SVMs to tackle more complex tasks such as connected digit recognition or continuous speech recognition. Finally we draw some conclusions and outline several ongoing lines of research.
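
To illustrate the sequence-length problem mentioned in the abstract, here is a minimal sketch of classic Dynamic Time Warping between two feature sequences of different lengths. It is not the DTAK kernel itself and is not taken from the chapter. The function name `dtw_distance`, the Euclidean local distance, and the three-way step pattern are assumptions chosen for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a has shape (n, d) and b has shape (m, d); n and m may differ.
    1-D inputs are treated as sequences of scalar frames.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)

    # Local frame-to-frame Euclidean distances, shape (n, m).
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] = cheapest alignment cost of a[:i] against b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return acc[n, m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated frame can be absorbed by the warping path, even though the two sequences have different lengths. A plain DTW cost like this is not guaranteed to yield a valid (positive semidefinite) kernel, which is one reason kernel-specific constructions are studied.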
