Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

ADVANCES IN UNSUPERVISED AUDIO SEGMENTATION FOR THE BROADCAST NEWS AND NGSW CORPORA

Abstract

Unsupervised audio segmentation remains a challenging research problem that significantly affects Automatic Speech Recognition (ASR) and Spoken Document Retrieval (SDR) performance. This paper presents advances in audio segmentation for unsupervised multi-speaker change detection. First, we investigate new features intended to be better suited to segmentation: PMVDR (Perceptual Minimum Variance Distortionless Response), SZCR (Smoothed Zero Crossing Rate), and FBLC (FilterBank Log Coefficients). Next, we consider a new distance metric, T²-mean, intended to improve segmentation of short segments (<5 s). A novel false alarm compensation procedure is also developed and applied after the segmentation phase. We establish an evaluation procedure for segmentation that is more effective than the traditional EER and Frame Accuracy approaches. Employing these advances within our new scheme yields more than a 30% improvement in segmentation performance on the 3-hour Hub4 Broadcast News 1997 evaluation data. Evaluations are also presented for audio from the NGSW [13] corpus.
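
The abstract names a T²-based distance for speaker change detection but does not define it. As a rough illustration only, the sketch below assumes that T² refers to a Hotelling-style two-sample statistic computed between the feature frames on either side of a candidate boundary. The pooled diagonal covariance, the function names `t2_distance` and `best_change_point`, and the `margin` parameter are all assumptions made for this example and are not taken from the paper.

```python
from statistics import fmean


def t2_distance(seg_a, seg_b, eps=1e-9):
    """Hotelling-style two-sample T^2 between two lists of feature vectors.

    Assumption for this sketch: a pooled *diagonal* covariance, so each
    feature dimension contributes independently. `eps` avoids division by zero.
    """
    n_a, n_b = len(seg_a), len(seg_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each segment needs at least two frames")
    dim = len(seg_a[0])
    total = 0.0
    for d in range(dim):
        col_a = [frame[d] for frame in seg_a]
        col_b = [frame[d] for frame in seg_b]
        mean_a, mean_b = fmean(col_a), fmean(col_b)
        # Pooled within-segment variance for this dimension.
        ss = sum((x - mean_a) ** 2 for x in col_a)
        ss += sum((x - mean_b) ** 2 for x in col_b)
        var = ss / (n_a + n_b - 2) + eps
        total += (mean_a - mean_b) ** 2 / var
    # Scale factor of the two-sample statistic: n_a * n_b / (n_a + n_b).
    return (n_a * n_b) / (n_a + n_b) * total


def best_change_point(frames, margin=2):
    """Return the split index i that maximises T^2(frames[:i], frames[i:]).

    `margin` (hypothetical parameter) keeps at least that many frames on
    each side of the candidate boundary.
    """
    if margin < 2 or len(frames) < 2 * margin:
        raise ValueError("not enough frames for the requested margin")
    candidates = range(margin, len(frames) - margin + 1)
    return max(candidates, key=lambda i: t2_distance(frames[:i], frames[i:]))


if __name__ == "__main__":
    # Toy one-dimensional "features": two clusters standing in for two speakers.
    toy = [[0.0], [0.1], [-0.1], [0.05], [5.0], [5.1], [4.9], [5.05]]
    print(best_change_point(toy))
```

A statistic built only from segment means and a pooled variance needs few frames per side to be estimated, which is one plausible reason a T²-style distance could suit short segments; whether this matches the paper's T²-mean metric cannot be confirmed from the abstract alone.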
