This paper treats speech/music discrimination of radio recordings as a maximization task that is solved by means of dynamic programming. The proposed method seeks the sequence of segments and respective class labels (i.e., speech or music) that maximizes the product of the posterior class label probabilities, given the data within each segment. To this end, a Bayesian Network combiner is embedded as the posterior probability estimator. Tests were performed on a large set of radio recordings covering several music genres. The experiments show that the proposed scheme achieves an overall performance of 92.32%. Results are also reported per genre, and a comparison with existing methods is given.
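To illustrate the kind of optimization described above, the following is a minimal sketch and not the authors' implementation. It assumes a recording split into `n_frames` frames and a hypothetical callable `posterior(start, end, label)` that stands in for the Bayesian Network combiner. The names `best_segmentation`, `make_toy_posterior`, `min_len`, and `max_len` are illustrative assumptions, as is the toy estimator, which simply averages made-up per-frame "music-likeness" scores. The product of posteriors is maximized as a sum of logarithms.

```python
import math

LABELS = ("speech", "music")


def best_segmentation(n_frames, posterior, min_len=1, max_len=None):
    """Dynamic programming over segment boundaries (illustrative sketch).

    posterior(start, end, label) is a hypothetical estimator returning
    P(label | frames[start:end]). Returns (log_score, segments), where
    segments is a list of (start, end, label) covering [0, n_frames).
    """
    if max_len is None:
        max_len = n_frames
    # best[t] = best log-score of any segmentation of frames [0, t)
    best = [-math.inf] * (n_frames + 1)
    back = [None] * (n_frames + 1)
    best[0] = 0.0
    for end in range(1, n_frames + 1):
        for length in range(min_len, min(max_len, end) + 1):
            start = end - length
            if best[start] == -math.inf:
                continue  # no valid segmentation ends at `start`
            for label in LABELS:
                p = posterior(start, end, label)
                if p <= 0.0:
                    continue  # log undefined; such a segment can never win
                score = best[start] + math.log(p)
                if score > best[end]:
                    best[end] = score
                    back[end] = (start, label)
    if back[n_frames] is None:
        raise ValueError("no segmentation satisfies the length constraints")
    # Trace the stored back-pointers from the end to recover the segments.
    segments = []
    end = n_frames
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return best[n_frames], segments


def make_toy_posterior(music_scores):
    """Toy stand-in for the estimator: mean per-frame music-likeness."""
    def posterior(start, end, label):
        p_music = sum(music_scores[start:end]) / (end - start)
        return p_music if label == "music" else 1.0 - p_music
    return posterior


# Made-up scores: three speech-like frames followed by three music-like ones.
toy_scores = [0.125, 0.125, 0.125, 0.875, 0.875, 0.875]
log_score, segments = best_segmentation(len(toy_scores), make_toy_posterior(toy_scores))
```

Because every posterior is at most 1, the product criterion in this sketch penalizes unnecessary splits, so the toy input is divided only at the point where the scores change. Whether the original method constrains segment durations in the same way is not stated in the abstract.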