Pitch based selection of optimal search space at runtime: Speaker recognition perspective

Abstract

Large-scale speaker recognition (SR) applications need efficient design strategies and smart optimization techniques to remain usable in real time. Selecting an optimal search space at runtime can reduce the computational cost involved. This paper describes a multilayer design with a novel Pitch Based Dynamic Pruning (PBDP) algorithm that optimizes VQ- and GMM-based closed-set SR. The process selects the most likely speakers at runtime, based on the percentage of cumulative pitch occurrence frequencies within certain pitch ranges selected from the test utterance, and then performs spectral matching with MFCC features within the reduced search space. Experiments on the YOHO and NIST2008 corpora show that nearly 40% of the total identification time is saved, with a slight (below 0.5%) increase, or even a decrease, in average error rate. The proposed pruning method may also be applicable to selecting the most likely flexible background in the unconstrained cohort normalization task of the verification problem.
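The two-stage idea in the abstract (pitch-based pruning, then spectral matching within the reduced search space) can be sketched as follows. This is a minimal illustration and not the authors' PBDP algorithm: the bin width, the `coverage` and `threshold` parameters, the function names, and the nearest-codeword VQ scoring are all assumptions made for the example, and pitch tracking and MFCC extraction are assumed to be done elsewhere.

```python
import numpy as np

# Hypothetical pitch bins (Hz); the paper's actual ranges are not given here.
BIN_EDGES = np.arange(50, 401, 10)

def pitch_histogram(pitch_hz):
    """Percentage of voiced frames that fall in each pitch bin."""
    counts, _ = np.histogram(pitch_hz, bins=BIN_EDGES)
    total = counts.sum()
    return 100.0 * counts / total if total else counts.astype(float)

def select_ranges(test_hist, coverage=80.0):
    """Pick the most frequent test bins until they cover `coverage` percent."""
    order = np.argsort(test_hist)[::-1]          # most frequent bins first
    cum = np.cumsum(test_hist[order])
    n = int(np.searchsorted(cum, coverage)) + 1  # smallest n reaching coverage
    return order[:n]

def prune_speakers(test_pitch, speaker_hists, coverage=80.0, threshold=50.0):
    """Keep speakers whose cumulative pitch percentage inside the selected
    test ranges reaches `threshold` (an assumed discarding rule)."""
    bins = select_ranges(pitch_histogram(test_pitch), coverage)
    kept = [s for s, h in speaker_hists.items() if h[bins].sum() >= threshold]
    return kept or list(speaker_hists)           # fall back to the full set

def vq_distortion(features, codebook):
    """Mean distance from each feature frame to its nearest codeword."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def identify(test_pitch, test_mfcc, speaker_hists, codebooks, **kw):
    """Spectral matching restricted to the pruned candidate set."""
    candidates = prune_speakers(test_pitch, speaker_hists, **kw)
    return min(candidates, key=lambda s: vq_distortion(test_mfcc, codebooks[s]))

# Toy data: speaker A has pitch near 120 Hz, speaker B near 220 Hz.
speaker_hists = {
    "A": pitch_histogram(np.linspace(110, 130, 50)),
    "B": pitch_histogram(np.linspace(210, 230, 50)),
}
codebooks = {"A": np.zeros((4, 13)), "B": np.ones((4, 13))}
test_pitch = np.linspace(112, 128, 30)
test_mfcc = np.full((20, 13), 0.1)

print(prune_speakers(test_pitch, speaker_hists))
print(identify(test_pitch, test_mfcc, speaker_hists, codebooks))
```

In this toy setup only speaker A survives pruning, so the VQ distortion is computed against one codebook instead of two. Savings of that kind are what the abstract attributes to the reduced search space, although the real selection rule and thresholds are defined in the paper itself.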

Bibliographic record

Source
International Conference on Intelligent Human Computer Interaction | 2012 | pp. 1-6 | 6 pages
Authors
Khan Soma; Basu Joyanta; Bepari Milton S.; Roy Rajib
Original format
PDF
Chinese Library Classification
Operating systems
Keywords
frequent voicing activity zone; occurrence discarding threshold; pitch based dynamic pruning

Similar literature

1. Gaussian-selection-based non-optimal search for speaker identification [J]. Roch M. Speech Communication. 2006, No. 1
2. Speaker indexing based on speaker model selection and automatic speech recognition in discussions [J]. Masafumi Nishida, Yuya Akita, Tatsuya Kawahara. 電子情報通信学会技術研究報告. 音声. Speech. 2002, No. 530
3. Speaker indexing based on speaker model selection and automatic speech recognition in discussions [J]. Masafumi Nishida, Yuya Akita, Tatsuya Kawahara. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2002, No. 528
4. Pitch based selection of optimal search space at runtime: Speaker recognition perspective [C]. Khan Soma, Basu Joyanta, Bepari Milton S. International Conference on Intelligent Human Computer Interaction. 2012
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005
6. A Search Method for Optimal Band Combination of Hyperspectral Imagery Based on Two Layers Selection Strategy [O]. Nian Chen, Kezhong Lu, Hao Zhou. 2021
7. Classification of Pitch and Gender of Speakers for Forensic Speaker Recognition from Disguised Voices Using Novel Features Learned by Deep Convolutional Neural Networks [O]. Athulya M. Swamidasan Unni Nair, Sathidevi P. Savithri. 2021
8. Effect of Reference Set Selection on Speaker Dependent Speech Recognition. Frame Compression in Isolated Word Recognition [R]. Li, Z., Alleva, F., Reddy, R. 1981
