International Conference on Artificial Intelligence in Information and Communication

Multi-Channel Audio Source Separation Using Azimuth-Frequency Analysis and Convolutional Neural Network

Abstract

MPEG-H supports object-based as well as channel-based audio content, so there is a need for a sound source separation technique that converts channel-based audio to object-based audio. Among the various sound source separation techniques, azimuth-frequency (AF) based separation has been proposed for this conversion. Unfortunately, it is difficult to set the optimal azimuth and width with this technique. In this paper, we propose a method that determines the optimal azimuth and width using a convolutional neural network (CNN) classifier. First, a different set of audio signals is separated for each of numerous candidate azimuths and widths. Each separated audio set is then categorized into a specific audio class by the CNN classifier. Finally, to separate a desired audio signal, the azimuth and width with the highest similarity for the given class are selected. The performance of the CNN classifier is evaluated in terms of separation accuracy and objective measures such as signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifacts ratio (SAR). The proposed method provides higher SDR, SIR, SAR, and separation accuracy than a minimum variance distortionless response (MVDR) beamformer and than a method that uses AF analysis alone.
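
The selection procedure described in the abstract can be read as a search over candidate (azimuth, width) pairs. The following is only a minimal sketch of that reading, not the authors' implementation: the function name `select_azimuth_width`, the `separate` and `classify` callables, and the toy stand-ins below are all hypothetical, and the abstract does not say how the AF separator or the CNN similarity score is actually computed.

```python
from itertools import product
from typing import Any, Callable, Dict, Iterable, Tuple


def select_azimuth_width(
    mixture: Any,
    azimuths: Iterable[float],
    widths: Iterable[float],
    separate: Callable[[Any, float, float], Any],
    classify: Callable[[Any], Dict[str, float]],
    target_class: str,
) -> Tuple[float, float, float, Any]:
    """Return (score, azimuth, width, signal) for the best-scoring candidate.

    `separate` and `classify` are hypothetical placeholders for an AF-based
    separator and a CNN classifier; neither is specified by the abstract.
    """
    best = None
    for azimuth, width in product(azimuths, widths):
        # Step 1: separate one candidate signal for this azimuth and width.
        candidate = separate(mixture, azimuth, width)
        # Step 2: score the candidate against the desired audio class.
        score = classify(candidate).get(target_class, 0.0)
        # Step 3: keep the pair with the highest score seen so far.
        if best is None or score > best[0]:
            best = (score, azimuth, width, candidate)
    if best is None:
        raise ValueError("no (azimuth, width) candidates were supplied")
    return best


def toy_separate(mixture: Any, azimuth: float, width: float) -> Dict[str, float]:
    # Toy stand-in: a real separator would return audio samples.
    return {"azimuth": azimuth, "width": width}


def toy_classify(signal: Dict[str, float]) -> Dict[str, float]:
    # Toy stand-in for the CNN: the score peaks at azimuth 30 and width 10.
    distance = abs(signal["azimuth"] - 30) + abs(signal["width"] - 10)
    return {"speech": 1.0 / (1.0 + distance)}


score, azimuth, width, _ = select_azimuth_width(
    mixture=None,
    azimuths=range(0, 91, 15),
    widths=(5, 10, 20),
    separate=toy_separate,
    classify=toy_classify,
    target_class="speech",
)
print(azimuth, width, score)
```

Passing the separator and classifier in as callables keeps the search loop independent of any particular AF implementation or network architecture, which matters here because the abstract leaves both unspecified.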
