Conference: International Symposium on Computer Music Modeling and Retrieval

Auditory Sketches: Sparse Representations of Sounds Based on Perceptual Models

Abstract

A central question for both signal processing and auditory science is which features of a sound carry the most important information for the listener. Here we approach the issue by introducing the idea of "auditory sketches": sparse representations of sounds, severely impoverished compared to the original, which nevertheless afford good performance on a given perceptual task. Starting from biologically grounded representations (auditory models), a sketch is obtained by reconstructing a highly under-sampled selection of elementary atoms. The sketch is then evaluated in a psychophysical experiment involving human listeners, and the process can be repeated iteratively. As a proof of concept, we present data for an emotion recognition task with short non-verbal sounds. We investigate (1) the type of auditory representation that can be used for sketches, (2) the selection procedure used to sparsify such representations, (3) the smallest number of atoms that can be kept, and (4) the robustness to noise. Results indicate that it is possible to produce recognizable sketches with a very small number of atoms per second. Furthermore, at least in our experimental setup, a simple and fast under-sampling method based on selecting local maxima of the representation seems to perform as well as or better than a more traditional algorithm aimed at minimizing the reconstruction error. Thus, auditory sketches may be a useful tool for choosing sparse dictionaries, and also for identifying the minimal set of features required in a specific perceptual task.
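The local-maxima selection mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the auditory model, the neighbourhood used to define a peak, or how the sound is resynthesized, so the 3x3 neighbourhood, the function names `local_maxima` and `sketch_by_local_maxima`, and the small `toy` grid below are all assumptions made for illustration. The grid stands in for a time-frequency representation (rows as frequency channels, columns as time frames), and the code only performs the sparsification step, not the inversion back to audio.

```python
from typing import List, Tuple

# Assumed layout: rows are frequency channels, columns are time frames.
Grid = List[List[float]]


def local_maxima(rep: Grid) -> List[Tuple[float, int, int]]:
    """Return (value, row, col) for each cell strictly greater than all its neighbours."""
    n_rows, n_cols = len(rep), len(rep[0])
    peaks = []
    for r in range(n_rows):
        for c in range(n_cols):
            v = rep[r][c]
            # Collect the up-to-8 surrounding cells (fewer at the borders).
            neighbours = [
                rep[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((v, r, c))
    return peaks


def sketch_by_local_maxima(rep: Grid, n_atoms: int) -> Grid:
    """Keep the n_atoms largest local maxima and set every other cell to zero."""
    peaks = sorted(local_maxima(rep), reverse=True)[:n_atoms]
    sketch = [[0.0] * len(row) for row in rep]
    for v, r, c in peaks:
        sketch[r][c] = v
    return sketch


# Hypothetical 4-channel, 4-frame representation with two clear peaks.
toy = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.1, 0.2],
]
sketch = sketch_by_local_maxima(toy, n_atoms=2)
```

Lowering `n_atoms` corresponds to the question of how few atoms can be kept; a real experiment would express this budget as atoms per second and would still need an invertible auditory model to turn the sparse grid back into a sound for listeners.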
