ADVANCES IN UNSUPERVISED AUDIO SEGMENTATION FOR THE BROADCAST NEWS AND NGSW CORPORA

Abstract

Unsupervised audio segmentation remains a challenging research problem that significantly affects Automatic Speech Recognition (ASR) and Spoken Document Retrieval (SDR) performance. This paper presents advances in audio segmentation for unsupervised multi-speaker change detection. First, we investigate new features intended to be more appropriate for segmentation: PMVDR (Perceptual Minimum Variance Distortionless Response), SZCR (Smoothed Zero Crossing Rate), and FBLC (FilterBank Log Coefficients). Next, we consider a new distance metric, T²-mean, intended to improve segmentation of short segments (<5 s). A novel false alarm compensation procedure is also developed and applied after the segmentation phase. We establish an evaluation procedure for segmentation that is more effective than the traditional EER and Frame Accuracy approaches. Employing these advances within our new scheme yields more than a 30% improvement in segmentation performance on the 3-hour Hub4 Broadcast News 1997 evaluation data. Evaluations are also presented for audio from the NGSW [13] corpus.
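As a rough illustration of one of the features named in the abstract, the sketch below computes a frame-level zero crossing rate and then smooths it. The abstract does not say how SZCR is smoothed, so the centered moving average, the frame length, the hop size, and the function names `zero_crossing_rate` and `smooth` are all assumptions made for this example and should not be read as the authors' definition.

```python
from typing import List, Sequence


def zero_crossing_rate(samples: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Fraction of adjacent-sample sign changes in each frame (frame_len >= 2)."""
    rates = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        rates.append(crossings / (frame_len - 1))
    return rates


def smooth(values: Sequence[float], width: int) -> List[float]:
    """Centered moving average (an assumed smoother); the window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


if __name__ == "__main__":
    # Toy signal: slow sign alternation followed by fast sign alternation.
    slow = [1.0, 1.0, -1.0, -1.0] * 4   # 16 samples
    fast = [1.0, -1.0] * 8              # 16 samples
    zcr = zero_crossing_rate(slow + fast, frame_len=8, hop=8)
    print(zcr)
    print(smooth(zcr, width=3))
```

On the toy signal, the raw rate is lower for the slowly alternating half than for the rapidly alternating half, and the moving average spreads the jump between them over neighbouring frames.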

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech, and Signal Processing | 2004 | 4 pages
Authors
Rongqing Huang; John H. L. Hansen
Original format: PDF
Chinese Library Classification: TN912-53

Similar literature

1. Advances in unsupervised audio classification and segmentation for the broadcast news and NGSW corpora [J]. Rongqing Huang, John H. L. Hansen. IEEE Transactions on Audio, Speech, and Language Processing. 2006, no. 3
2. Audio segmentation-by-classification approach based on factor analysis in broadcast news domain [J]. Diego Castán, Alfonso Ortega, Antonio Miguel. EURASIP Journal on Audio, Speech, and Music Processing. 2014, no. 1
3. Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion [J]. Taras Butko, Climent Nadeu. EURASIP Journal on Audio, Speech, and Music Processing. 2011, no. 1
4. ADVANCES IN UNSUPERVISED AUDIO SEGMENTATION FOR THE BROADCAST NEWS AND NGSW CORPORA [C]. Rongqing Huang, John H. L. Hansen. IEEE International Conference on Acoustics, Speech, and Signal Processing. 2004
5. Fake news? A survey on video news releases and their implications on journalistic ethics, integrity, independence, professionalism, credibility, and commercialization of broadcast news [D]. Clark, Chandra Smallwood. 2009
6. Original research: Quantifying alcohol audio-visual content in UK broadcasts of the 2018 Formula 1 Championship: a content analysis and population exposure [O]. Alexander Barker, Magdalena Opazo-Breton, Emily Thomson. 2020
7. UNSUPERVISED BROADCAST NEWS STORY SEGMENTATION USING DISTANCE DEPENDENT CHINESE RESTAURANT PROCESSES [O]. Chao Yang, Lei Xie, Xiangzeng Zhou. 2015
