Conference: International Symposium INFOTEH-JAHORINA

Audio Signal Mapping into Spectrogram-Based Images for Deep Learning Applications

Abstract

Various features generated from raw audio signals can be used as input to a deep learning model. They include hand-crafted features such as mel-frequency cepstral coefficients, two-dimensional time-frequency representations, and raw audio data. In most cases, the time-frequency representations are related to so-called spectrogram-based images. Having an image at the input of a deep learning model makes it possible to apply the performance improvements accumulated in video and image processing. However, spectrogram-based images have specific properties that should be taken into account when a deep learning model is designed. This paper deals with the mapping of audio signals into the most common spectrogram-based images. Some unique properties of these images, as well as the way they are generated, are analyzed here for the particular case of fridge sounds.
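
As a rough illustration of what mapping an audio signal into a spectrogram-based image can involve, the sketch below computes a short-time Fourier transform magnitude, converts it to decibels, and quantizes it to an 8-bit grayscale image. It is not the procedure used in the paper: the frame length, hop size, Hann window, 80 dB dynamic range, the function names `stft_magnitude` and `to_image`, and the synthetic 1 kHz tone (standing in for a fridge recording) are all assumptions chosen for illustration. Only the Python standard library is used, so the transform is a naive DFT rather than an FFT.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one list of frame_len // 2 + 1 magnitudes per time frame."""
    # Periodic Hann window (assumed choice) to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    # Precomputed DFT basis for the non-negative frequency bins.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)]
             for k in range(n_bins)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([abs(sum(s * b for s, b in zip(seg, row)))
                       for row in basis])
    return frames


def to_image(frames, dynamic_range_db=80.0):
    """Map magnitudes to an 8-bit grayscale image.

    Rows are frequency bins (highest frequency in the top row),
    columns are time frames.
    """
    eps = 1e-10  # avoids log10(0) for silent bins
    db = [[20 * math.log10(m + eps) for m in frame] for frame in frames]
    peak = max(max(frame) for frame in db)
    floor = peak - dynamic_range_db
    image = []
    for k in reversed(range(len(db[0]))):
        row = []
        for frame in db:
            clipped = min(max(frame[k], floor), peak)
            row.append(round(255 * (clipped - floor) / dynamic_range_db))
        image.append(row)
    return image


# Hypothetical input: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(1024)]
image = to_image(stft_magnitude(tone))
print(len(image), "rows x", len(image[0]), "columns")
```

With these settings the tone falls on frequency bin 8 of 33, so the brightest pixels form a single horizontal line. Unlike a natural photograph, the two image axes have different physical meanings (frequency and time), which is one of the properties the abstract says should be considered when designing a model.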
