Journal: IEEE Transactions on Audio, Speech, and Language Processing

Large Margin Discriminative Semi-Markov Model for Phonetic Recognition

Abstract

This paper considers a large margin discriminative semi-Markov model (LMSMM) for phonetic recognition. The hidden Markov model (HMM) framework that is often used for phonetic recognition assumes only local statistical dependencies between adjacent observations, and it predicts a label for each observation without explicit phone segmentation. The semi-Markov model (SMM) framework, by contrast, allows simultaneous segmentation and labeling of sequential data based on a segment-based Markovian structure that assumes statistical dependencies among all the observations within a phone segment. For phonetic recognition, which is inherently a joint segmentation and labeling problem, the SMM framework has the potential to perform better than the HMM framework at the expense of a slight increase in computational complexity. The SMM framework considered in this paper is based on a non-probabilistic discriminant function that is linear in the joint feature map, which attempts to capture long-range statistical dependencies among observations. The parameters of the discriminant function are estimated by a large margin learning framework for structured prediction. The parameter estimation problem at hand leads to an optimization problem with many margin constraints, and this constrained optimization problem is solved using a stochastic gradient descent algorithm. The proposed LMSMM outperformed the large margin discriminative HMM on the TIMIT phonetic recognition task.
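To make the joint segmentation and labeling idea concrete, the following is a minimal Python sketch of semi-Markov decoding with a discriminant function that is linear in a segment-level feature map. It is an illustration under stated assumptions, not the paper's implementation: the function names, the three-component feature map (sum of observations, segment length, constant bias), the hand-set weights, the maximum segment length, and the toy one-dimensional "observations" are all hypothetical, and transition features between adjacent labels are omitted for brevity.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_decode(
    n_frames: int,
    labels: List[str],
    max_len: int,
    segment_score: Callable[[int, int, str, Optional[str]], float],
) -> Tuple[float, List[Segment]]:
    """Return the highest-scoring segmentation and labeling of frames [0, n_frames).

    best[t][y] is the best total score over all segmentations of frames [0, t)
    whose last segment carries label y. Requires n_frames >= 1.
    """
    neg_inf = -math.inf
    best = [dict.fromkeys(labels, neg_inf) for _ in range(n_frames + 1)]
    back: List[Dict[str, Optional[Tuple[int, Optional[str]]]]] = [
        dict.fromkeys(labels, None) for _ in range(n_frames + 1)
    ]
    for t in range(1, n_frames + 1):
        for y in labels:
            # Try every allowed duration d for a segment ending at frame t.
            for d in range(1, min(max_len, t) + 1):
                s = t - d
                if s == 0:
                    candidates = [(None, 0.0)]  # first segment has no predecessor
                else:
                    candidates = [(p, best[s][p]) for p in labels]
                for prev, prev_score in candidates:
                    if prev_score == neg_inf:
                        continue
                    score = prev_score + segment_score(s, t, y, prev)
                    if score > best[t][y]:
                        best[t][y] = score
                        back[t][y] = (s, prev)
    # Backtrace from the best final label.
    y = max(labels, key=lambda lab: best[n_frames][lab])
    total = best[n_frames][y]
    segments: List[Segment] = []
    t = n_frames
    while t > 0:
        s, prev = back[t][y]
        segments.append((s, t, y))
        t, y = s, prev
    segments.reverse()
    return total, segments


# Toy data (hypothetical): one scalar observation per frame.
x = [0.1, 0.2, 0.1, 0.9, 1.0, 0.8]

# Hypothetical per-label weights over the segment features
# (sum of observations, segment length, constant bias).
weights = {"a": (0.2, -0.01, -0.5), "b": (1.8, -0.81, -0.5)}


def linear_segment_score(s: int, t: int, y: str, prev: Optional[str]) -> float:
    """Score w_y . phi(x, s, t); depends on all frames inside the segment."""
    phi = (sum(x[s:t]), float(t - s), 1.0)
    return sum(w * f for w, f in zip(weights[y], phi))


total, segments = semi_markov_decode(len(x), ["a", "b"], 4, linear_segment_score)
print(segments, total)
```

In this sketch the score of a segment depends on every frame inside it, which is the property that distinguishes the segment-based structure from frame-level labeling. The extra loop over durations is where the additional computational cost relative to frame-level Viterbi decoding arises.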
