Singing Voice Separation by Low-Rank and Sparse Spectrogram Decomposition with Pre-learned Dictionaries

SHIWEI YU; HONGJUAN ZHANG; ZHIYAO DUAN


Abstract

Unsupervised spectrogram decomposition has shown promising results for singing voice separation in recent years. Its basic idea is to decompose the mixture spectrogram into a sparse spectrogram for the singing voice and a low-rank spectrogram for the background music. This approach, however, has two limitations. First, because it is unsupervised, it cannot pre-learn voice and background music dictionaries from isolated singing voice and background music recordings, which are widely available even though they are unrelated to the song being separated. Second, some components of the singing voice (e.g., fricatives) and of the background music (e.g., less repetitive background) may not exhibit the expected sparse and low-rank properties, respectively. In this paper we propose to decompose the mixture spectrogram into three parts: a sparse spectrogram representing the singing voice, a low-rank spectrogram representing the background music, and a residual spectrogram for the components captured by neither the sparse nor the low-rank spectrogram. In addition, we learn universal voice and music dictionaries from isolated singing voice and background music training data. Finally, we propose an approach named Low-rank and Sparse representation with Pre-learned Dictionaries, formulated under the framework of the Alternating Direction Method of Multipliers (ADMM). Experiments on the MIR-1K and iKala datasets show that it achieves better separation performance.
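
To make the baseline idea in the abstract concrete, the following is a minimal sketch of the generic unsupervised low-rank plus sparse split (often called robust PCA) solved with ADMM. It is not the authors' method: it has no pre-learned dictionaries and no residual term, and the function names, the default weight `lam`, and the penalty `mu` are assumptions taken from the commonly used robust PCA recipe rather than from the paper.

```python
import numpy as np

def soft_threshold(X, tau):
    """Elementwise shrinkage: proximal operator of tau * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink singular values: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def rpca_admm(M, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split M into a low-rank part L and a sparse part S with M ~ L + S.

    Generic robust PCA via ADMM (hypothetical helper, not the paper's
    algorithm). lam and mu defaults are assumed heuristics.
    """
    M = np.asarray(M, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(M.shape))                 # assumed sparsity weight
    if mu is None:
        mu = 0.25 * M.size / (np.abs(M).sum() + 1e-12)    # assumed penalty
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                                  # Lagrange multiplier
    norm_M = np.linalg.norm(M) + 1e-12
    for _ in range(n_iter):
        # Update the low-rank part with the sparse part held fixed.
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        # Update the sparse part with the low-rank part held fixed.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # Dual ascent on the constraint M = L + S.
        R = M - L - S
        Y = Y + mu * R
        if np.linalg.norm(R) / norm_M < tol:
            break
    return L, S
```

Applied to a magnitude spectrogram (frequency bins by time frames), `L` would play the role of the background music estimate and `S` the singing voice estimate, in the sense described by the abstract. Computing the spectrogram and resynthesizing audio are left out of the sketch.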

Bibliographic details

Source
Journal of the Audio Engineering Society | 2017, Issue 5 | pp. 377-388 | 12 pages
Authors
SHIWEI YU; HONGJUAN ZHANG; ZHIYAO DUAN
Affiliations

Department of Mathematics, Shanghai University, Shanghai 200444, P. R. China;

Department of Mathematics, Shanghai University, Shanghai 200444, P. R. China;

Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA

Original format: PDF
Language: English

