Journal: IEEE Transactions on Audio, Speech, and Language Processing

Audio-Based Semantic Concept Classification for Consumer Video

Abstract

This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips.
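
The single-Gaussian variant of the pipeline described above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: each clip is already available as a list of feature frames (real MFCC extraction is omitted), covariances are restricted to diagonal so the sketch needs only the standard library, the Kullback-Leibler divergence is symmetrized by summing both directions, and the kernel form `exp(-gamma * D)` and the value of `gamma` are assumed because the abstract does not specify them. All function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Summarize one clip as per-dimension (mean, variance) lists.

    Assumption: diagonal covariance, floored to avoid division by zero.
    """
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    mean = [fmean(d) for d in dims]
    var = [max(pvariance(d, mu), var_floor) for d, mu in zip(dims, mean)]
    return mean, var


def kl_diag(p, q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(p, q):
    """Symmetrized KL divergence, used here as the clip-to-clip distance."""
    return kl_diag(p, q) + kl_diag(q, p)


def kernel_matrix(models, gamma=0.1):
    """Gram matrix K[i][j] = exp(-gamma * D(i, j)); kernel form is an assumption."""
    return [[math.exp(-gamma * symmetric_kl(a, b)) for b in models] for a in models]


# Toy usage: two "clips" with 2-dimensional frames standing in for MFCC vectors.
clip_a = [[0.0, 1.0], [2.0, 3.0]]
clip_b = [[1.0, 1.0], [3.0, 3.0]]
models = [fit_diag_gaussian(c) for c in (clip_a, clip_b)]
gram = kernel_matrix(models)
```

A Gram matrix of this kind could then be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, training one binary classifier per concept since the 25 classes overlap.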