To extract the content of audio documents, the first step in many approaches is to segment the signal into primary components, such as music and speech. Very little attention has been paid to the detection of the singing voice. In this paper, we propose simple parameters (vibrato and harmonic coefficient) and an original segmentation method, based on a sinusoidal segmentation, to characterize the singing voice. This information is then combined with that obtained from a speech/music decomposition. We test this classification system on a database composed of various types of sound, first in a classification task, then in a detection task. In both cases, the results are good. In the classification task, the only misclassifications are due to very rare musical styles. In the detection task, our system misses some of the singing voice segments, but we observe very few false alarms.