Notes on Nonnegative Tensor Factorization of the Spectrogram for Audio Source Separation: Statistical Insights and Towards Self-Clustering of the Spatial Cues

Abstract

Nonnegative tensor factorization (NTF) of multichannel spectrograms under PARAFAC structure has recently been proposed by Fitzgerald et al. as a means of performing blind source separation (BSS) of multichannel audio data. In this paper we investigate the statistical source models implied by this approach. We show that it implicitly assumes a nonpoint-source model, in contrast with the usual BSS assumptions, and we clarify the links between the measure of fit chosen for the NTF and the implied statistical distribution of the sources. While the original approach of Fitzgerald et al. requires a posterior clustering of the spatial cues to group the NTF components into sources, we discuss means of performing the clustering within the factorization. In the results section we test the impact of the simplifying nonpoint-source assumption on underdetermined linear instantaneous mixtures of musical sources and discuss the limits of the approach for such mixtures.
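To make the PARAFAC structure mentioned in the abstract concrete, the following is a minimal sketch of how a nonnegative multichannel spectrogram tensor might be factorized. It is an illustration under stated assumptions, not the method of the paper or of Fitzgerald et al.: the measure of fit is assumed to be the generalized Kullback-Leibler divergence (one common choice among several), the updates are the standard multiplicative ones for that divergence, and all names (`ntf_parafac_kl`, `Q`, `W`, `H`, the axis order frequency x frame x channel) are hypothetical. The model is V[f, n, i] ≈ sum_k Q[i, k] W[f, k] H[n, k], where each column of `Q` plays the role of the spatial cue of one component.

```python
import numpy as np


def reconstruct(Q, W, H):
    """PARAFAC model: V_hat[f, n, i] = sum_k Q[i, k] * W[f, k] * H[n, k]."""
    return np.einsum('ik,fk,nk->fni', Q, W, H)


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence, summed over all tensor entries."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def ntf_parafac_kl(V, n_components, n_iter=100, seed=0, eps=1e-12):
    """Factorize a nonnegative tensor V (freq x frames x channels).

    Assumed setup (not taken from the source): random initialization and
    multiplicative updates for the generalized KL divergence.
    Returns the factors Q (channels x K), W (freq x K), H (frames x K)
    and the list of divergence values, one per iteration.
    """
    rng = np.random.default_rng(seed)
    F, N, I = V.shape
    Q = rng.random((I, n_components)) + eps
    W = rng.random((F, n_components)) + eps
    H = rng.random((N, n_components)) + eps
    costs = []
    for _ in range(n_iter):
        # Update spectral templates W, using the ratio of data to model.
        R = V / (reconstruct(Q, W, H) + eps)
        W *= np.einsum('fni,ik,nk->fk', R, Q, H) / (Q.sum(0) * H.sum(0) + eps)
        # Update temporal activations H.
        R = V / (reconstruct(Q, W, H) + eps)
        H *= np.einsum('fni,ik,fk->nk', R, Q, W) / (Q.sum(0) * W.sum(0) + eps)
        # Update per-component channel gains Q (the spatial cues).
        R = V / (reconstruct(Q, W, H) + eps)
        Q *= np.einsum('fni,fk,nk->ik', R, W, H) / (W.sum(0) * H.sum(0) + eps)
        costs.append(kl_divergence(V, reconstruct(Q, W, H)))
    return Q, W, H, costs
```

In a scheme of this kind, grouping components into sources would be a separate step applied to the columns of `Q` after the factorization; the abstract's point is that this grouping can instead be built into the factorization itself.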
