Conference: INTERSPEECH 2012

Combining frame and segment based models for environmental sound classification

Abstract

This paper considers the task of recognizing environmental sounds, which plays a critical role in human perception of the auditory context in audiovisual materials. A variety of features have been proposed for audio recognition, either frame-based or segmental. Here, we propose a two-stage framework that combines modeling at these two levels. First, Gaussian Mixture Models (GMMs) are built on short-term features and a preclassification is performed. Then, when the GMMs are not certain about the result, the system engages Support Vector Machines (SVMs) to refine the output hypothesis. In this second stage, the features are combined by taking the posterior estimates of the GMMs, along with segmental features, as the SVMs' input features. Experiments on a sound dataset show that the proposed framework improves on traditional methods.
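
The two-stage decision flow described in the abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not say how GMM certainty is measured, so the fixed posterior threshold, the length-normalized softmax used to form posteriors, the equal class priors, and the names `segment_posteriors`, `classify_segment`, and `second_stage` are all assumptions made here. The per-class GMMs and the SVM are treated as already trained, with the SVM passed in as a plain callable.

```python
import math
from typing import Callable, List, Sequence, Tuple


def segment_posteriors(frame_loglik: Sequence[Sequence[float]]) -> List[float]:
    """Turn per-frame, per-class log-likelihoods into segment-level posteriors.

    frame_loglik[t][k] is the log-likelihood of frame t under the class-k GMM.
    Equal class priors and length normalization are assumptions of this sketch.
    """
    n_frames = len(frame_loglik)
    n_classes = len(frame_loglik[0])
    # Average over frames so the score does not grow with segment length.
    avg = [sum(frame[k] for frame in frame_loglik) / n_frames
           for k in range(n_classes)]
    # Softmax, subtracting the maximum for numerical stability.
    top = max(avg)
    exps = [math.exp(a - top) for a in avg]
    total = sum(exps)
    return [e / total for e in exps]


def classify_segment(
    frame_loglik: Sequence[Sequence[float]],
    segment_features: Sequence[float],
    second_stage: Callable[[List[float]], int],
    threshold: float = 0.8,  # hypothetical confidence threshold
) -> Tuple[int, str]:
    """Return (class index, name of the stage that made the decision)."""
    post = segment_posteriors(frame_loglik)
    best = max(range(len(post)), key=lambda k: post[k])
    # Stage 1: accept the GMM preclassification when it is confident enough.
    if post[best] >= threshold:
        return best, "gmm"
    # Stage 2: stack GMM posteriors with segmental features and hand the
    # combined vector to the second-stage classifier (an SVM in the paper).
    combined = list(post) + list(segment_features)
    return second_stage(combined), "svm"
```

In practice `second_stage` would wrap a trained SVM's prediction function, and the threshold would be tuned on held-out data. Both choices are left open here because the abstract does not specify them.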
