In this paper, we address the problem of monaural music and speech separation based on soft mask filtering. As with other well-known techniques, statistical models of the sources must be estimated. We therefore employ vector quantization (VQ) in the synthesis stage, which yields more accurate codebook entries for each source than the commonly used Gaussian mixture model (GMM) approach. In the separation stage, we compare the nonlinear mask proposed in this work with other well-known techniques in terms of undesirable signal-to-interference ratio (SIR) effects. We demonstrate that the proposed semi-soft mask achieves the best performance in terms of both SIR and subjective measures.