International Conference on Digital Signal Processing

Rank-based frame classification for usable speech detection in speaker identification systems

Abstract

The performance of a speaker identification (SID) system degrades substantially when the training and testing conditions are mismatched. Discriminating between temporal sections of a speech signal that are speech-like (SID usable) and noise-like (SID unusable), and retaining only the frames labeled SID usable, can improve SID performance substantially. In this paper, a novel labeling system for SID-usable and SID-unusable frames is presented for a GMM-based SID system. It is motivated by a control experiment demonstrating that very high SID accuracies are theoretically achievable by removing frames that contribute more to the scores of competing speakers than to the score of the true speaker. To identify these SID-usable and SID-unusable frames blindly, a Mahalanobis distance classifier and an ensemble of decision tree classifiers (with boosting) were trained on a dataset different from the enrollment database of the SID system. The classifier-based techniques improved on the baseline speaker identification system (all frames used) in all cases when the speech signal was corrupted with additive white or additive pink noise.
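The control experiment described in the abstract can be illustrated with a small sketch. The abstract does not give the exact labeling rule, so the code below rests on assumptions: per-frame log-likelihoods from each speaker's GMM are already available as a frames-by-speakers table, and a frame counts as SID usable when the true speaker ranks within `max_rank` among all speakers for that frame. The function names, the `max_rank` parameter, and the toy scores are hypothetical and not taken from the paper.

```python
from typing import List, Optional, Sequence

# Assumption: scores[t][s] is the log-likelihood of frame t under speaker s's GMM.
Scores = Sequence[Sequence[float]]


def true_speaker_rank(frame_scores: Sequence[float], true_idx: int) -> int:
    """Rank (1 = best) of the true speaker among all speakers for one frame."""
    true_score = frame_scores[true_idx]
    return 1 + sum(1 for s in frame_scores if s > true_score)


def label_usable_frames(scores: Scores, true_idx: int, max_rank: int = 1) -> List[bool]:
    """Oracle labels: True where the true speaker ranks within max_rank.

    This is an assumed rule for illustration, not the paper's exact criterion.
    """
    return [true_speaker_rank(frame, true_idx) <= max_rank for frame in scores]


def identify(scores: Scores, keep: Optional[Sequence[bool]] = None) -> int:
    """Return the speaker index with the highest summed log-likelihood
    over the retained frames (all frames when keep is None)."""
    n_speakers = len(scores[0])
    totals = [0.0] * n_speakers
    n_kept = 0
    for t, frame in enumerate(scores):
        if keep is None or keep[t]:
            n_kept += 1
            for s, value in enumerate(frame):
                totals[s] += value
    if n_kept == 0:
        raise ValueError("no frames retained; cannot identify a speaker")
    return max(range(n_speakers), key=totals.__getitem__)


# Toy data (hypothetical): 4 frames, 3 speakers, true speaker is index 0.
toy_scores = [
    [-1.0, -2.0, -3.0],  # true speaker scores best
    [-5.0, -1.0, -4.0],  # noise-like frame favors speaker 1
    [-6.0, -1.5, -7.0],  # noise-like frame favors speaker 1
    [-2.0, -2.5, -3.0],  # true speaker scores best
]
toy_labels = label_usable_frames(toy_scores, true_idx=0)
all_frames_decision = identify(toy_scores)
usable_only_decision = identify(toy_scores, keep=toy_labels)
```

On this toy table, using every frame picks speaker 1, while keeping only the oracle-usable frames picks the true speaker 0. The oracle needs the true speaker's identity, so it only bounds what is achievable; the blind classifiers mentioned in the abstract would have to predict such labels from the frames themselves.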
