International Conference on Information and Communication Technologies

Development and Evaluation of Automatic -Speaker based- Audio Identification and Segmentation for Broadcast News Recordings Indexation



Abstract

In this paper, we describe an automatic, speaker-based audio segmentation and identification system for indexing broadcast news. We focus specifically on speaker identification and audio scene detection. Speaker identification (SI) is based on state-of-the-art Gaussian mixture models (GMMs), whereas scene change detection uses the classical Bayesian Information Criterion (BIC) and the recently proposed DISTBIC algorithm. We compare the effectiveness of Mel-frequency cepstral coefficients (MFCC), linear predictive cepstral coefficients (LPCC), and log area ratio (LAR) coefficients for text-independent speaker identification and speaker-based audio segmentation. Both the Fisher discrimination ratio feature analysis and the performance evaluation, in terms of correct identification rate on the TIMIT database, showed that LPCC outperforms the other features, especially for low-order coefficients. Our experiments on the audio segmentation module showed that the DISTBIC segmentation technique is more accurate than the BIC procedure, especially in the presence of short segments.
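
The abstract names BIC-based change detection but gives no formulas, so the following is only a minimal sketch of the commonly used formulation, not the authors' implementation. It models each window of feature frames with a single Gaussian and, as a simplifying assumption, uses a diagonal covariance so that it needs only the standard library. A positive ΔBIC at a candidate split suggests a speaker or scene change there. The function names, the `margin` and `penalty_weight` parameters, and the synthetic data helper are all hypothetical; the DISTBIC pre-selection stage is not shown.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of a diagonal covariance: sum of per-dimension log variances."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        column = [frame[k] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        total += math.log(max(var, 1e-10))  # floor the variance to avoid log(0)
    return total


def delta_bic(frames, i, penalty_weight=1.0):
    """Delta-BIC for splitting `frames` into frames[:i] and frames[i:].

    Assumption: one diagonal-covariance Gaussian per segment, so each model
    has 2 * dim free parameters (means and variances).
    """
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:i], frames[i:]
    gain = 0.5 * (
        n * log_det_diag(frames)
        - len(left) * log_det_diag(left)
        - len(right) * log_det_diag(right)
    )
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n)
    return gain - penalty


def find_change_point(frames, margin=10, penalty_weight=1.0):
    """Return (index, score) of the best positive Delta-BIC split, or (None, 0.0)."""
    best_index, best_score = None, 0.0
    for i in range(margin, len(frames) - margin):
        score = delta_bic(frames, i, penalty_weight)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


def make_synthetic(seed=0, length=100, dim=2, shift=5.0):
    """Hypothetical test data: two segments whose means differ by `shift`."""
    rng = random.Random(seed)
    first = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(length)]
    second = [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(length)]
    return first + second


if __name__ == "__main__":
    index, score = find_change_point(make_synthetic())
    print("detected change near frame", index, "with Delta-BIC", round(score, 1))
```

In practice the frames would be MFCC, LPCC, or LAR vectors rather than synthetic values, and a full-covariance model would change both the log-determinant and the parameter count in the penalty term.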
