
Filterbank coefficients selection for segmentation in singer turns



Abstract

Audio segmentation is often the first step of an audio indexing system. It provides segments that are assumed to be acoustically homogeneous. In this paper, we report our recent experiments on segmenting music recordings into singer turns, by analogy with speaker turns in speech processing. We compare two types of acoustic features for this task: filterbank coefficients (FBANK) and Mel-frequency cepstral coefficients (MFCC). FBANK features outperformed MFCC on a "clean" singing corpus. We describe a coefficient selection method that yielded a further improvement on this corpus. An F-measure of 75.8% was obtained with FBANK features selected by this method, a 30.6% absolute gain over MFCC. On another corpus composed of ethnomusicological recordings, both feature types performed similarly, at about 60%. This corpus is more difficult because instruments overlap with the singing and the recording quality is lower.
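The abstract reports segmentation quality as an F-measure but does not state the matching protocol. As a minimal sketch, assuming a common convention in which a detected singer-turn boundary counts as correct when it lies within a fixed tolerance of a reference boundary, the score could be computed as follows. The function name `boundary_f_measure`, the 0.5 s tolerance, and the example timestamps are all hypothetical and are not taken from the paper.

```python
def boundary_f_measure(reference, detected, tolerance=0.5):
    """Return (precision, recall, F-measure) for boundary detection.

    Assumption: a detected boundary is a hit if it lies within
    `tolerance` seconds of a reference boundary, and each boundary
    can be matched at most once.
    """
    reference = sorted(reference)
    detected = sorted(detected)
    matched = 0
    i = j = 0
    # Two-pointer greedy matching over the sorted boundary times.
    while i < len(reference) and j < len(detected):
        diff = detected[j] - reference[i]
        if abs(diff) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # detected boundary is too early: false alarm
        else:
            i += 1  # reference boundary has no match: miss
    precision = matched / len(detected) if detected else 0.0
    recall = matched / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical boundary times in seconds, for illustration only.
reference_turns = [10.0, 25.0, 40.0]
detected_turns = [10.3, 24.0, 33.0, 40.2]
precision, recall, f_measure = boundary_f_measure(reference_turns, detected_turns)
```

In this toy example two of the four detected boundaries match, giving a precision of 0.5, a recall of 2/3, and an F-measure of 4/7. Read the same way, the reported 30.6% absolute gain at 75.8% implies an MFCC F-measure of about 45.2% on the clean corpus.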
