International Conference on Intelligent Human Computer Interaction

Pitch based selection of optimal search space at runtime: Speaker recognition perspective


Abstract

Large-scale speaker recognition (SR) applications demand an efficient design strategy and smart optimization techniques to improve real-time usability. Selecting an optimal search space at runtime can reduce the computational cost involved. This paper describes a multilayer design layout with a novel Pitch Based Dynamic Pruning (PBDP) algorithm to optimize VQ- and GMM-based closed-set SR. At runtime, the most likely speakers are selected based on the percentage of cumulative pitch occurrence frequencies within certain pitch ranges chosen from the test utterance; spectral matching using MFCC features is then performed within the reduced search space. Experiments on the YOHO and NIST2008 corpora show that nearly 40% of the total identification time is saved, with a slight (below 0.5%) increase, or even a decrease, in average error rate. The proposed pruning method may also be applicable to selecting the most likely flexible background in the unconstrained cohort normalization task of the verification problem.
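The two-stage idea in the abstract (pitch-based pruning first, spectral matching second) can be illustrated with a minimal Python sketch. This is not the authors' PBDP implementation: the abstract does not give the bin width, how the pitch ranges are chosen, or the pruning threshold, so `BIN_WIDTH_HZ`, `top_k`, `min_percent`, all function names, and the fallback to a full search are assumptions made for illustration only. The spectral stage is shown as a plain VQ distortion over toy vectors standing in for MFCC frames.

```python
from collections import Counter

BIN_WIDTH_HZ = 10  # assumed pitch bin width; the abstract does not specify one


def pitch_histogram(pitches_hz, bin_width=BIN_WIDTH_HZ):
    """Count pitch values per bin, skipping unvoiced frames (f0 <= 0)."""
    return Counter(int(f0 // bin_width) for f0 in pitches_hz if f0 > 0)


def select_test_bins(test_hist, top_k):
    """Assumed rule: use the top_k most populated bins of the test utterance."""
    return {b for b, _ in test_hist.most_common(top_k)}


def cumulative_percentage(speaker_hist, bins):
    """Percentage of a speaker's enrolled pitch frames that fall in `bins`."""
    total = sum(speaker_hist.values())
    if total == 0:
        return 0.0
    return 100.0 * sum(speaker_hist[b] for b in bins) / total


def prune_speakers(test_pitches, speaker_hists, top_k=3, min_percent=50.0):
    """Keep speakers whose pitch mass in the selected ranges is high enough."""
    bins = select_test_bins(pitch_histogram(test_pitches), top_k)
    return [name for name, hist in speaker_hists.items()
            if cumulative_percentage(hist, bins) >= min_percent]


def vq_distortion(frames, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        total += min(sum((x - c) ** 2 for x, c in zip(frame, codeword))
                     for codeword in codebook)
    return total / len(frames)


def identify(test_pitches, test_frames, speaker_hists, codebooks, **prune_args):
    """Spectral matching restricted to the pitch-pruned candidate set."""
    candidates = prune_speakers(test_pitches, speaker_hists, **prune_args)
    if not candidates:  # assumed fallback: search everyone if pruning is empty
        candidates = list(codebooks)
    return min(candidates, key=lambda s: vq_distortion(test_frames, codebooks[s]))


# Toy enrollment data (hypothetical speakers and values).
speaker_hists = {
    "alice": pitch_histogram([205, 210, 215, 220, 225, 212]),
    "bob": pitch_histogram([110, 115, 120, 125, 118, 122]),
    "carol": pitch_histogram([200, 208, 214, 231, 236, 219]),
}
codebooks = {
    "alice": [(0.0, 0.0), (1.0, 1.0)],
    "bob": [(3.0, 3.0), (4.0, 4.0)],
    "carol": [(5.0, 5.0), (6.0, 6.0)],
}
test_pitches = [0, 211, 213, 218, 222, 224, 0, 216]  # 0 marks unvoiced frames
test_frames = [(0.1, 0.0), (0.9, 1.1)]

shortlist = prune_speakers(test_pitches, speaker_hists, top_k=2, min_percent=30.0)
winner = identify(test_pitches, test_frames, speaker_hists, codebooks,
                  top_k=2, min_percent=30.0)
print(shortlist, winner)
```

In this toy run the pitch stage discards "bob" before any spectral distance is computed, which is where the time saving reported in the abstract would come from. A threshold that is too strict could prune the true speaker, which is consistent with the small change in error rate the abstract reports.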
