2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Learning emotion-based acoustic features with deep belief networks



Abstract

The medium of music has evolved specifically for the expression of emotions, and it is natural for us to organize music in terms of its emotional associations. But while such organization is a natural process for humans, quantifying it empirically proves to be a very difficult task, and as such no dominant feature representation for music emotion recognition has yet emerged. Much of the difficulty in developing emotion-based features lies in the ambiguity of the ground truth. Even within the smallest time window, opinions on the emotion are bound to vary and to reflect some disagreement between listeners. In previous work, we have modeled human response labels to music in the arousal-valence (A-V) representation of affect as a time-varying, stochastic distribution. Current methods for the automatic detection of emotion in music seek performance increases by combining several feature domains (e.g. loudness, timbre, harmony, rhythm). Such work has focused largely on dimensionality reduction for minor classification performance gains, but has provided little insight into the relationship between audio and emotional associations. In this new work we seek to employ regression-based deep belief networks to learn features directly from magnitude spectra. While the system is applied to the specific problem of music emotion recognition, it could be easily applied to any regression-based audio feature learning problem.
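The pipeline the abstract describes — greedy layer-wise pretraining of a deep belief network on magnitude spectra, followed by a regression head that maps the learned features to arousal-valence coordinates — can be sketched as below. This is a minimal illustration, not the authors' implementation: the layer sizes, learning rate, and the use of random data in place of real spectral frames and A-V labels are all assumptions, and a simple least-squares head stands in for whatever regressor the paper uses.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli restricted Boltzmann machine trained with one-step
    contrastive divergence (CD-1)."""

    def __init__(self, n_vis, n_hid, lr=0.1):
        self.W = rng.normal(0.0, 0.01, (n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def train(self, data, epochs=20):
        for _ in range(epochs):
            # Positive phase: hidden activations driven by the data
            h0 = self.hidden_probs(data)
            h0_sample = (rng.random(h0.shape) < h0).astype(float)
            # Negative phase: one Gibbs step back to a reconstruction
            v1 = sigmoid(h0_sample @ self.W.T + self.b_vis)
            h1 = self.hidden_probs(v1)
            # CD-1 update: data statistics minus reconstruction statistics
            self.W += self.lr * (data.T @ h0 - v1.T @ h1) / len(data)
            self.b_vis += self.lr * (data - v1).mean(axis=0)
            self.b_hid += self.lr * (h0 - h1).mean(axis=0)

# Stand-ins for real inputs: 200 "magnitude spectrum" frames in [0, 1]
# and 2-D arousal-valence targets (both random, for illustration only).
X = rng.random((200, 64))
Y = rng.random((200, 2))

# Greedy layer-wise pretraining of a two-layer deep belief network:
# each RBM is trained on the hidden activations of the one below it.
rbm1 = RBM(64, 32)
rbm1.train(X)
H1 = rbm1.hidden_probs(X)

rbm2 = RBM(32, 16)
rbm2.train(H1)
H2 = rbm2.hidden_probs(H1)

# Linear regression head maps the learned features to A-V coordinates.
features = np.c_[H2, np.ones(len(H2))]  # append a bias column
W_out, *_ = np.linalg.lstsq(features, Y, rcond=None)
pred = features @ W_out
print(pred.shape)  # one (arousal, valence) pair per frame: (200, 2)
```

In a full system the pretrained weights would typically initialize a feed-forward network that is then fine-tuned end to end on the regression loss, rather than being frozen under a linear readout as here.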


