Selection, Parameter Estimation, and Discriminative Training of Hidden Markov Models for General Audio Modeling

Abstract

Hidden Markov models (HMMs) provide a natural and flexible way to model time-sequential data. The ease with which concatenation and time-warping algorithms can be implemented on HMMs makes them well suited to segmentation and content-based audio classification applications, as is clear from their extensive and successful use in speech recognition. Speech has a natural basic unit, the phone, which normally limits the number of models to one per phone. Moreover, knowledge of the speech structure facilitates the choice of the model parameters. When modeling generic audio, on the other hand, the lack of a natural basic unit and the absence of a clear structure make the selection and parameter estimation of an optimal set of HMMs difficult. In this paper we present different approaches to selecting and estimating the HMM parameters of a set of representative generic audio classes. We compare these approaches in the context of a content-based classification application using the MuscleFish database. The models are first found through frame clustering or by traditional EM techniques under specific selection criteria, such as the Bayesian information criterion. Further discriminative training of the initial models considerably improves their performance in the content-based classification task, yielding results comparable to those obtained on the same task by inherently discriminative classification methods, such as support vector machines, while preserving the intrinsic flexibility of HMMs.
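The abstract names the Bayesian information criterion (BIC) as one selection criterion but does not give its exact form. As a rough illustration only, the sketch below uses the common textbook form, BIC = -2 log L + k ln T, and assumes a single-Gaussian, diagonal-covariance emission per state. The function names, the parameter-counting convention, and the log-likelihood values are all hypothetical and are not taken from the paper.

```python
import math


def hmm_param_count(n_states, n_features):
    """Free parameters of an HMM with one diagonal-covariance Gaussian per state.

    Assumption for illustration: the paper may use a different emission model.
    """
    initial = n_states - 1                     # initial-state probabilities
    transitions = n_states * (n_states - 1)    # each row of the transition matrix sums to 1
    emissions = 2 * n_states * n_features      # one mean and one variance per dimension
    return initial + transitions + emissions


def bic(log_likelihood, n_params, n_frames):
    """Textbook BIC: lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_frames)


def select_by_bic(log_likelihoods, n_features, n_frames):
    """Pick the state count with the lowest BIC.

    log_likelihoods maps a candidate number of states to the training
    log-likelihood reached by EM for that candidate.
    """
    scores = {
        n_states: bic(ll, hmm_param_count(n_states, n_features), n_frames)
        for n_states, ll in log_likelihoods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Made-up log-likelihoods for three candidate model sizes.
candidates = {1: -1000.0, 2: -900.0, 3: -895.0}
best, scores = select_by_bic(candidates, n_features=2, n_frames=500)
```

With these placeholder numbers the gain in likelihood from two to three states is too small to offset the penalty for the extra parameters, so the two-state candidate is selected.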