Journal: IEEE Transactions on Games

“Are You Playing a Shooter Again?!” Deep Representation Learning for Audio-Based Video Game Genre Recognition

Abstract

In this paper, we present a novel computer audition task: audio-based video game genre classification. The aim of this study is threefold: 1) to check the feasibility of the proposed task; 2) to introduce a new corpus, the Game Genre by Audio + Multimodal Extracts (G²AME), collected entirely from social multimedia; and 3) to compare how well various acoustic feature spaces classify the G²AME corpus into six game genres using a linear support vector machine classifier. For the classification, we extract three different feature representations from the game audio files: 1) knowledge-based acoustic features; 2) Deep Spectrum features; and 3) quantized Deep Spectrum features using Bag-of-Audio-Words. The Deep Spectrum features are a deep-learning-based representation obtained by forwarding visual representations of the audio instances (in particular spectrograms, mel-spectrograms, chromagrams, and their deltas) through deep, task-independent, pretrained CNNs. Specifically, the activations of fully connected layers from three common image classification CNNs (GoogLeNet, AlexNet, and VGG16) are used as feature vectors. Results for the six-genre classification problem indicate the suitability of our deep learning approach for this task. Our best method achieves up to 66.9% unweighted average recall (UAR) under tenfold cross-validation.
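
The abstract reports its result as unweighted average recall rather than plain accuracy. As a minimal sketch of that metric, assuming its standard definition (the mean of the per-class recalls), the Python below computes it from two label lists. The function names and the genre labels in the example are hypothetical and chosen only for illustration; they are not taken from the paper or its corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly, for comparison."""
    hits = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return hits / len(y_true)


# Made-up, imbalanced toy data: 8 clips of one genre, 2 of another.
y_true = ["shooter"] * 8 + ["puzzle"] * 2
# A degenerate classifier that always predicts the majority genre.
y_pred = ["shooter"] * 10

print(accuracy(y_true, y_pred))                   # 0.8
print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

In this toy case the always-majority classifier looks strong on accuracy but scores only 0.5 on unweighted average recall, because the minority class has a recall of zero. By the same reasoning, a classifier that always predicts one of six genres would score 1/6, roughly 16.7%, which is a useful baseline when reading the reported figure.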