Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

Towards a more efficient sparse coding based audio-word feature extraction system


Abstract

This paper is concerned with the efficiency of sparse coding based audio-word feature extraction systems. In particular, we have defined and added the concepts of early and late temporal pooling to the classic sparse coding based audio-word feature extraction pipeline, and we have tested them on the genre tags subset of the CAL10k data set. We define temporal pooling as any function that transforms the input time series representation into a more temporally compact representation. Under this definition, we have examined two temporal pooling functions for improving the efficiency of feature extraction: Early Texture Window Pooling and Multiple Frame Representation. Early texture window pooling tremendously boosts efficiency at the cost of retrieval accuracy, while multiple frame representation slightly improves both feature extraction efficiency and retrieval accuracy. Overall, our best feature extraction setup achieves a mean average precision of 0.202 on the genre tags subset of the CAL10k data set.
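To make the definition of temporal pooling concrete, the following is a minimal NumPy sketch of two functions that map a sequence of frame-level feature vectors to a shorter sequence. The abstract does not give the exact formulations, so everything here is an assumption for illustration: the function names, the use of per-window mean and standard deviation for texture window pooling, and the use of non-overlapping frame concatenation for multiple frame representation.

```python
import numpy as np


def texture_window_pool(frames, window, hop):
    """Summarize each window of frames by its per-dimension mean and std.

    frames: array of shape (T, D), one row per time frame.
    Returns an array of shape (num_windows, 2 * D).
    Assumption: mean/std statistics; the paper may use other summaries.
    """
    num_frames = frames.shape[0]
    if num_frames < window:
        raise ValueError("need at least `window` frames")
    starts = range(0, num_frames - window + 1, hop)
    return np.stack([
        np.concatenate([
            frames[s:s + window].mean(axis=0),
            frames[s:s + window].std(axis=0),
        ])
        for s in starts
    ])


def multiple_frame_representation(frames, n):
    """Concatenate every n consecutive frames into one longer vector.

    frames: array of shape (T, D). Trailing frames that do not fill a
    complete group are dropped. Returns shape (T // n, n * D).
    Assumption: non-overlapping groups.
    """
    num_frames, dim = frames.shape
    usable = (num_frames // n) * n
    return frames[:usable].reshape(num_frames // n, n * dim)


# Toy input: 12 frames with 2 feature dimensions each.
frames = np.arange(24, dtype=float).reshape(12, 2)
pooled = texture_window_pool(frames, window=4, hop=4)
stacked = multiple_frame_representation(frames, n=3)
```

In both cases the output has fewer rows than the input, which is the sense in which the representation is more temporally compact: any later stage, such as sparse coding, then has fewer vectors to process.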
